amigaocs_hardware.doc
1995-09-01
This is my hardware DOC file. It bears no relation to the stuff in any of
the Amiga manuals to my knowledge, but draws from the Amiga System Pro-
grammer's Guide by Abacus for much important material. Also, I include some
example pieces of code of my own, routines plus data structures, illustra-
ting how to manipulate the hardware directly.
List of Topics:
Hardware:Preamble
Hardware:The CIA Chips
Hardware:DMA
Hardware:Interrupts
Hardware:The Copper
Hardware:Bitplanes & Bitplane Control
Hardware:Sprite Management
Hardware:The Blitter
Hardware:Sound Management
Hardware:Disc Management
Hardware:Interfaces
Hardware:Mouse, Keyboard, Joysticks
Hardware:Some Notes
Hardware:Logic Tutorial (for mathematically skilled readers & masochists)
Hardware:Preamble
This DOC file is somewhat oddly organised. This is because several different
sections contain material that cross-references across topic headings. Where
a cross-reference exists, I'll signpost it using the # character followed
by a number. If you use Devpac to view this file, you can skip across the
cross-references using the search facility.
The main base addresses for the chips are $BFD000 and $BFE001 for
the CIAs, and $DFF000 for the gee-whizz custom chips. The custom chips are
organised as a linear block of word addresses from the base address onwards,
and the CIA register addresses occur at 256-byte intervals from the base add-
ress onwards. Other than that, there will occur extra pieces of information
relating the way that several of the chips are connected to various hardware
interfaces such as the keyboard and mouse. They are the main reason for the
implementation of the cross-referencing scheme in this file, because in the
case of the CIAs, they are connected to several different actual hardware
interfaces.
I find it best to use instructions of the form
MOVE.W #DATA,register(A0)
when accessing the chips, because there's less time overhead than with those
instructions using absolute addresses. In the above case, A0 contains the
base address of the chip set being accessed.
One more point. The 68000 CLR instruction issues a read access and
then a write access signal across its bus. Other 680x0 chips from the 68020
onwards perform a single write access only. Because of this, using something
like CLR COPJMP1(A0) is not recommended, even though it's shorter than using
MOVE.W #0,COPJMP1(A0), when accessing strobe registers (see later for an ex-
planation of what these are). Those blessed with assemblers capable of gen-
erating 68020/68030/68040 code and inline coprocessor code for the 68881 and
68882 maths coprocessors should NOT use these facilities when accessing the
custom chips in particular if they're lucky enough to have a 680x0 acceler-
ator board fitted. Save these facilities for accelerator-board specific code
only. Also be warned about invalid cache entry on these processors if the DMA
from the custom chips scribbles all over memory when the accelerator board
processor is attempting an access at that address range!
Given the propensity of games writers in particular to tell AmigaDos
to take a hike, I issue the following caveats: using MOVE to SR to kill off
interrupts doesn't work properly - the interrupts are handled by the 4703
custom chip and generated anyway. All that happens is that the 68000 ignores
them, and the relevant bits in the 4703 control registers don't get handled
properly. This can cause problems. Also, supervisor mode doesn't provide any real
advantages over user mode (although I'm guilty of abusing it sometimes). If
you're handling your own exceptions directly, take account of differences in
exception stack frames especially if you've got a 680x0 accelerator board! I
recently found out that a gee-whizz 68040 board is now available with 2 megs
of memory from a company in Pasadena. £2000 and it's yours. Also, if you are
the sort of programmer who pees around with the ROM, watch out!
One final point. To save typing out addresses of the form $DFFxxx in
full for custom chip register addresses, I'll refer to them via offsets from
the base address of $DFF000. It's a VERY GOOD idea to do this in programs as
well! Load, say, A5 with $DFF000 and access the register of your choice using
something like
MOVE.W #DATA,REGISTER_NAME(A5)
instead of littering code with absolute addresses (you'll also save 4 cycles
of processor time per access!). Any persons having encountered the dread set
of files called 'my_hardware.i' etc., which I have created for my own use &
recently distributed to others will find all of the custom chip registers in
name form defined as offset equates from the base address of $DFF000, and any
programs using these include files accessing the custom chips in the manner
above.
Hardware:The CIA Chips
There are two of these beasties on the Amiga. They're referred to as CIA-A &
CIA-B respectively. CIA-A lives at $BFE001 onwards, CIA-B at $BFD000 on. I
now present the register list. The formula for the address of each register
is register number*256+CIA base address, and all CIA registers are a
single byte wide.
Register  Name
--------  ----
0         PRA    (Port A Data Register)             R/W
1         PRB    (Port B Data Register)             R/W
2         DDRA   (Port A Data Direction Register)   R/W
3         DDRB   (Port B Data Direction Register)   R/W
4         TALO   (Timer A lower 8 bits)             R/W
5         TAHI   (Timer A upper 8 bits)             R/W
6         TBLO   (Timer B lower 8 bits)             R/W
7         TBHI   (Timer B upper 8 bits)             R/W
8         E.LSB  (Event counter lower byte)         R/W
9         E.MID  (Event counter middle byte)        R/W
A         E.MSB  (Event counter high byte)          R/W
B         not used
C         SP     (Serial port data register)        R/W
D         ICR    (Interrupt control register)       R/W
E         CRA    (Control register A)               R/W
F         CRB    (Control register B)               R/W
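The address formula can be written once as assembler equates. This is only
a sketch, assuming Devpac-style EQU syntax. The symbol names (CIAA, CIAB,
CIA_TALO and so on) are labels invented for illustration and are not taken
from any official include file:

```asm
CIAA      EQU    $BFE001          ;CIA-A base address
CIAB      EQU    $BFD000          ;CIA-B base address
CIA_PRA   EQU    $0*256           ;register 0 offset = $000
CIA_TALO  EQU    $4*256           ;register 4 offset = $400
CIA_TAHI  EQU    $5*256           ;register 5 offset = $500
CIA_ICR   EQU    $D*256           ;register D offset = $D00

          LEA    CIAA,A0          ;A0 = CIA-A base address
          MOVE.B CIA_TALO(A0),D0  ;byte read from $BFE001+$400 = $BFE401
```

The same offsets serve both chips. Only the base address loaded into A0
changes.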
Note:the 'serial port' on the CIAs is NOT the same as the Amiga serial port
at the back! This is an internal serial port, and on the CIA-A it is connect-
ed to the 6500/1 processor handling the keyboard (#1). The main serial port
for external communications is handled via Paula (#2).
Data registers:the CIA data registers are connected in a seemingly odd manner
to various different pieces of hardware. They can be treated bitwise as input
or output registers. If the corresponding DDRx bit is 0, the same bit on the
PRx is an input, and an output if the DDRx bit is a 1. Writing to a data reg-
ister stores the value in it, and reading it reads the state of those port
lines to which it is connected.
There is a simple handshaking mechanism via two lines, called PC and
FLAG. PC goes low for 1 clock cycle on each access to PRB. The FLAG input re-
sponds to such transitions. Every time the state of FLAG changes from 1 to 0,
the FLAG bit in the ICR is set. These two lines allow a simple form of hand-
shaking in which the PC and FLAG lines of the CIAs are cross-connected. The
sender need only write its data to the port register & then wait for a FLAG
signal before sending each additional byte. Since FLAG can generate an int-
errupt if wanted, the sender can perform other tasks while waiting for the
FLAG signal, and continue the sending via an interrupt routine. The same also
applies to the receiver, except that it reads the data from the port instead
of writing to it. CIA-A has its PRB connected to the parallel port (more cor-
rectly, the Centronics port data lines), and CIA-B has a whole series of awk-
ward connections for PRB as follows (Cross-reference #6):
Bit Connection
--- ----------
7 /MTR motor signal to disc drive
6 /SEL3 drive 3 select
5 /SEL2 drive 2 select
4 /SEL1 drive 1 select
3 /SEL0 drive 0 select
2 /SIDE drive side select
1 DIR data direction signal to disc drive
0 /STEP step signal to drive head stepper motor
Both CIAs have awkward connections for PRA. They are (CIA-A first, then
CIA-B):
CIA-A PRA:
Bit  Connection
---  ----------
7    Game port 1 pin 6 (fire button)
6    Game port 0 pin 6 (fire button)
5    /RDY    disc ready signal from disc drive
4    /TK0    disc track 00 signal from disc drive
3    /WRPO   write protect signal from disc drive
2    /CHNG   disc change signal from disc drive
1    LED     power LED (0=on, 1=off)
0    OVL     memory overlay bit (DANGER!!!!)
CIA-B PRA:
Bit  Connection
---  ----------
7    /DTR    DTR signal from serial interface
6    /RTS    RTS signal from serial interface
5    /CD     CD signal from serial interface
4    /CTS    CTS signal from serial interface
3    /DSR    DSR signal from serial interface
2    SEL     select signal for Centronics interface
1    POUT    paper out signal from Centronics interface
0    BUSY    busy signal from Centronics interface
When I said the connections were awkward, I wasn't kidding.
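As an illustration of bitwise access to CIA-A PRA, here is a small sketch
that flips the power LED and then waits for the fire button on game port 0.
It assumes that the data direction register already has bit 1 set up as an
output (the OS normally leaves it that way), and that the fire button line
reads 0 when pressed. That second point is my assumption, not something the
table above states. The names CIAA and wait_fire are illustrative only:

```asm
CIAA      EQU    $BFE001          ;CIA-A base; PRA is register 0, offset 0

          LEA    CIAA,A0
          BCHG   #1,(A0)          ;invert PRA bit 1: power LED changes state
wait_fire BTST   #6,(A0)          ;PRA bit 6: game port 0 fire button
          BNE.S  wait_fire        ;bit still 1 (assumed not pressed): loop
```

Using bit instructions leaves the other seven port bits as they were, which
matters here because bit 0 of the same register is OVL.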
Control registers:the two control registers determine the mode of operation
of many of the other registers. CRA bit allocations are:
Bit Function
--- --------
7 Not used
6 SPMODE serial port register mode 0=input, 1=output
5 INMODE 0=clock, 1=CNT (see later)
4 LOAD 1=force load (strobe) for timer A
3 RUNMODE 0=continuous, 1=one-shot for timer A
2 OUTMODE 0=pulse, 1=toggle
1 PBON 0=PB6 off, 1=PB6 on
0 START 0=off, 1=on
CRB bit allocations are:
Bit Function
--- --------
7 0=TOD (time of day-see later), 1=ALARM
6,5 INMODE
00 = clock, 01=CNT
10 = timer A input,
11 = timer A + CNT
4 LOAD 1=force load (strobe) for timer B
3 RUNMODE 0=continuous, 1=one-shot for timer B
2 OUTMODE 0=pulse, 1=toggle
1 PBON 0=PB7 off, 1=PB7 on
0 START 0=off, 1=on
Timers:this is one of the complicated things. These timers count down
from a preset value to zero. When reading, they are treated as TALO/TAHI
and TBLO/TBHI. When writing, they are treated as latches PALO/PAHI and
PBLO/PBHI. The latches should be loaded low byte first, as a write access
to the high register while the timer is stopped causes the timer to be
reloaded with the latch value. Setting the LOAD bit in the control register
transfers the latch value to the timer regardless of the timer state.
The LOAD bit in the CRA/CRB registers is a strobe bit. Writing simply causes
the given action to be performed - the bit value is not stored as such.
When the timer hits zero, the latch values are automatically put
back into the timer.
The latch is also known as the prescaler. The number of timeouts per
second in continuous mode is equal to the clock frequency divided by the
value in the prescaler/latch. Timer A is connected to the processor E clock,
the frequency being about 716KHz (about 709KHz on PAL machines).
Reading the timer can cause problems, because it has to be performed
in two separate read operations. One problem is that the timer could change
its state between the two reads. For example, if the timer bytes combined are
$0100 in value, and the high byte is read, the timer could decrement to $00FF
before the low byte is read, resulting in a final read value of $01FF, which
is incorrect (it should be $00FF). Stopping the timer from decrementing
during the read is one method of preventing this, but this is not the most
elegant way. Instead, read the high byte, then the low byte, then read the
high byte again into a separate register. If the two high byte reads are the
same, take this as the value. If not, repeat the process. It should only need
one repeat read to obtain the correct result, but in any case the following
code should obtain the correct value (code is for CIA-A):
          LEA    $BFE001,A0       ;CIA-A base address
get_timer MOVE.B $500(A0),D0      ;get TAHI
          MOVE.B $400(A0),D1      ;get TALO
          MOVE.B $500(A0),D2      ;get TAHI again
          CMP.B  D0,D2            ;high bytes match?
          BNE.S  get_timer        ;repeat read if not
          NOP                     ;here, D0=TAHI and D1=TALO are consistent
Bits 5 and 6 of the control registers control the timer functions. For timer
A, there are two possibilities. Either CRA INMODE=0, in which case timer A is
decremented during each clock cycle on the processor's E clock line, or else
CRA INMODE=1, in which case each high pulse on the CNT input line decrements
the timer.
Timer B has four possible operating modes, governed by the INMODE
bits of CRB. First there is INMODE mode 00, in which timer B decrements each
time a pulse occurs on the E clock line (connected to the f2 pin of the CIA)
just as for timer A INMODE=0. INMODE=01 decrements timer B every time there
is a pulse on the CNT line. INMODE=10 allows timers A and B to be combined to
form a 32-bit timer, and timer B decrements every time a timeout signal from
timer A is received. In this mode, timer A forms the low word of the 32-bit
timer, and timer B the high word. Finally, INMODE=11 allows the length of a
pulse on the CNT line to be measured. In this mode, timer B counts timeouts
from timer A only while the CNT line is high. The timeout signals are reported
in the interrupt control register or ICR, to be dealt with later.
Two output modes for the timers can be selected with the OUTMODE bit
of the control registers. OUTMODE=0 causes timeout signals to appear as a
positive pulse one clock period long on the corresponding port line. When the
OUTMODE=1 (toggle mode), each timeout causes the corresponding port line to
change value from low to high or high to low. Each time the timer is started
in this mode, the output starts as a high signal.
The RUNMODE bit determines whether the timer operates in one-shot
mode (RUNMODE=1) or operates continuously (RUNMODE=0). In one-shot mode, the
timer stops after each timeout and sets the START bit to 0. In the contin-
uous mode the timer restarts after each timeout automatically.
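Putting the latch, LOAD, RUNMODE and START details together, a timer can be
set running along these lines. This is a sketch under stated assumptions: it
uses timer B of CIA-B, assumes you have already taken the machine away from
the OS (which otherwise uses this timer), and overwrites the whole of CRB,
so bit 7 (TOD/ALARM) is left at 0. COUNT is a hypothetical latch value
chosen to give roughly 100 timeouts per second from a 716KHz clock:

```asm
CIAB      EQU    $BFD000          ;CIA-B base address
TBLO      EQU    $600             ;register 6: timer B latch, low byte
TBHI      EQU    $700             ;register 7: timer B latch, high byte
CRB       EQU    $F00             ;register F: control register B
COUNT     EQU    7159             ;716000/7159 = about 100 timeouts/sec

          LEA    CIAB,A0
          MOVE.B #%00000000,CRB(A0)   ;START=0: timer B stopped, INMODE=00
          MOVE.B #COUNT&$FF,TBLO(A0)  ;latch low byte first
          MOVE.B #COUNT>>8,TBHI(A0)   ;then latch high byte
          MOVE.B #%00010001,CRB(A0)   ;LOAD strobe, RUNMODE=0, START=1
```

For a single timeout instead of a repeating one, the last write would be
%00011001 (RUNMODE=1).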
Timer A of CIA-A is used by the operating system for communication
with the keyboard, and timer B by the operating system for some other tasks.
Timer A of CIA-B is used for serial data transfers, otherwise it is free, and
timer B is used to synchronise the blitter with the screen by the OS, other-
wise it is free.
Interrupt Control:this is actually implemented as two registers, one being a
read register and one being a write register. The bit allocations are given
in the table below, read register first. The read register is the ICR data
register, the write register the ICR enable mask register.
ICR data register (read):
Bit  Function
---  --------
7    IR - Interrupt received/signalled
6    not used, always 0
5    not used, always 0
4    FLAG - PRB port handshaking
3    SP - serial port needs attention
2    Alarm signal
1    TB - Timeout for Timer B
0    TA - Timeout for Timer A
ICR enable mask register (write):
Bit  Function
---  --------
7    Set/Clr bit (explained below)
6    not used, send a 0
5    not used, send a 0
4    FLAG input enable
3    SP - serial port interrupt enable
2    Alarm input enable
1    Timer B timeout enable
0    Timer A timeout enable
The set/clr bit is used as follows. Let us assume that the byte to be written
to the ICR is %00010011. Bit 7 (the set/clr bit) is 0. This tells the input
latch for the ICR to clear all bits in the ICR write register corresponding
to the 1 bits in the byte just written to it, and to leave all other bits un-
changed. This particular write byte has bits 4,1 and 0 set, so the FLAG, TB
and TA interrupt inputs are disabled. If the byte was %10010011 instead, the
set bit 7 (corresponding to set/clr on the ICR) would tell the input latch
for the ICR to set all bits corresponding to the other 1 bits in the byte
just written, and leave all of the others unchanged. This write byte would
enable the interrupt sources disabled by the previous write byte example.
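In code, the two example bytes look like this. A minimal sketch for CIA-A,
with CIAA and ICR as illustrative equate names; the third write is an extra
example of mine showing how to switch every source off in one go:

```asm
CIAA      EQU    $BFE001          ;CIA-A base address
ICR       EQU    $D00             ;register D: interrupt control register

          LEA    CIAA,A0
          MOVE.B #%00010011,ICR(A0)   ;bit 7=0: disable FLAG, TB and TA
          MOVE.B #%10010011,ICR(A0)   ;bit 7=1: enable FLAG, TB and TA
          MOVE.B #%00011111,ICR(A0)   ;bit 7=0: disable all five sources
```

In each case the mask bits that are 0 in the written byte keep whatever
state they had before.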
Note that this set/clr mechanism occurs throughout the Amiga chip
set, in particular for DMA control (#3), main system interrupt control (#4)
and other functions.
Having written a control byte to enable various interrupt sources
(CIA-local interrupt sources, NOT main system ones!), we can now read the
same address and the chip will return the value of the ICR data register,
which contains bitwise information about which interrupt source triggered
the CIA's internal interrupt mechanism. Note that any interrupt handled via
the CIAs is further passed onto the system as a whole, to allow 68000 auto-
vectored interrupt code to handle the interrupt. More about this later.
If this value is needed for multiple interrupt source tests, then it
must be saved, because reading the ICR data register causes the CIA to clear
it after the read. The IR bit is the interrupt request bit, and if set,
indicates that a valid CIA internal interrupt was triggered. But this and all
of the other ICR data bits are cleared upon reading, so save the result of
the read, as this code does:
          LEA    $BFE001,A0             ;CIA-A base addr
          MOVE.B $D00(A0),D0            ;get ICR data reg value
          MOVE.B D0,last_icra_read(A6)  ;save in my variable block!
If you read the register and then throw the value away without processing
every bit, you have lost forever those unprocessed bits. You have been warned!
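One way to honour that warning is to read the register exactly once and
then test the copy, never the chip. The following is a sketch only: the
labels are invented, the NOPs stand in for real handling code, and that
code is assumed to leave D0 intact:

```asm
CIAA      EQU    $BFE001          ;CIA-A base address
ICR       EQU    $D00             ;register D: interrupt control register

          LEA    CIAA,A0
          MOVE.B ICR(A0),D0       ;the one and only read: chip copy is cleared
          BTST   #0,D0            ;TA timeout signalled?
          BEQ.S  no_ta
          NOP                     ;placeholder: timer A handling goes here
no_ta     BTST   #3,D0            ;SP (serial port) signalled?
          BEQ.S  no_sp
          NOP                     ;placeholder: serial port handling goes here
no_sp     NOP                     ;carry on testing other bits of D0 likewise
```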
The CIA signals an interrupt as follows. Whenever one of the various
interrupt sources sets its corresponding bit in the ICR data register, the
CIA checks to see if the corresponding ICR mask enable bit is also set. If
this is so, the CIA pulls the IRQ line low to signal the interrupt in hard-
ware, and then sets the IR bit (bit 7) in the ICR data register to signal the
interrupt in software also. How the main system treats this signal is handled
later. The IRQ line does not return to the high state (its normal state) un-
til the ICR data register is read (and hence the IR bit, and all others, are
cleared by the CIA).
Event Counter:this event counter differs from the 6526 Time-Of-Day counter
found, for example, in the Commodore 64 (yeuk!). This is almost the only
difference between the 6526 and the 8520 CIA. Those unfortunate enough to
have encountered the 6526 in past programming will breathe a sigh of relief.
There is one slight problem with the documentation that Commodore supply-
they insist on referring to TOD (time of day) in the literature, even though
this term is meaningless here.
Instead of the real-time clock which counts hours, minutes and sec-
onds on the 6526, the 8520 has a simple 24-bit event counter. It takes its
input signal on the TOD line (jeez, Commodore!) of the chip. The event coun-
ter starts at zero (or some other predefined state written to it) and counts
UPWARDS to $FFFFFF, before returning to zero. The event counter consists of
the counter proper and a latch register. When the high byte is read, the ac-
tual counter state is transferred completely to the 24-bit latch, and the
high byte of the latch returned. The counter continues counting undisturbed
while the remaining latch registers are read, mid-byte next, then low-byte.
ALWAYS read in the order high, mid, low or else this won't work!
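A sketch of the read sequence, assembling the three bytes into one 24-bit
value. The equate names are illustrative, and CIA-A is used purely as an
example:

```asm
CIAA      EQU    $BFE001          ;CIA-A base address
ELSB      EQU    $800             ;register 8: event counter low byte
EMID      EQU    $900             ;register 9: event counter middle byte
EMSB      EQU    $A00             ;register A: event counter high byte

          LEA    CIAA,A0
          MOVEQ  #0,D0            ;clear all 32 bits of the result
          MOVE.B EMSB(A0),D0      ;high byte first: latches the whole count
          LSL.L  #8,D0
          MOVE.B EMID(A0),D0      ;middle byte from the latch
          LSL.L  #8,D0
          MOVE.B ELSB(A0),D0      ;low byte last
                                  ;D0.L now holds $00HHMMLL
```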
Writing a value to the event counter causes the CIA to stop the
counter until the entire value is written, PROVIDED that it is written to
in the same order as it is read from above, namely high, mid, low. Once
the low byte is written, the counter starts up again, with the written value
as the start value. When writing the event counter value in normal mode, be
sure that the TOD/ALARM bit of the CRB control register is cleared!
An alarm function exists also. SET the Alarm bit in CRB as opposed
to clearing it as for normal event counter writing, and write a value into
the event counter. The chip will take this value as an alarm value, and if
the event counter value ever matches this alarm value, the alarm bit of the
interrupt control register is set. The value of the alarm setting CANNOT be
read, only written - reading the event counter always returns the current
counter latch value, regardless of the state of the TOD/Alarm bit in CRB. So
if you want to provide known alarm values, save them somewhere safe before
using them!
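Setting an alarm might look like the sketch below. It assumes the machine
has been taken over (the OS uses the event counters itself), uses CIA-A as
an example, and keeps its own copy of the alarm value because the chip will
not give it back. ALARMVAL, set_alarm and alarm_copy are hypothetical names
and values:

```asm
CIAA      EQU    $BFE001          ;CIA-A base address
ELSB      EQU    $800             ;register 8: event counter low byte
EMID      EQU    $900             ;register 9: event counter middle byte
EMSB      EQU    $A00             ;register A: event counter high byte
CRB       EQU    $F00             ;register F: control register B
ALARMVAL  EQU    $000100          ;hypothetical 24-bit alarm count

set_alarm LEA    CIAA,A0
          BSET   #7,CRB(A0)       ;ALARM=1: writes now set the alarm value
          MOVE.B #(ALARMVAL>>16)&$FF,EMSB(A0)  ;high byte first
          MOVE.B #(ALARMVAL>>8)&$FF,EMID(A0)   ;then middle byte
          MOVE.B #ALARMVAL&$FF,ELSB(A0)        ;low byte last
          BCLR   #7,CRB(A0)       ;ALARM=0: back to normal counter writes
          MOVE.L #ALARMVAL,alarm_copy          ;remember it ourselves
          RTS

alarm_copy DC.L  0                ;our own record of the alarm value
```

To get an interrupt when the alarm matches, the Alarm bit (bit 2) of the
ICR enable mask must be set as well.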
Serial Port:the serial port consists of the serial data register (which is
readable and writeable) and the shift register (not directly accessible).
Setting SPMODE=0 in the control register sets the serial port to input mode
and SPMODE=1 to output mode.
In input mode, the serial data on the SP line are shifted into the
shift register after each rising edge on the CNT line. After 8 CNT pulses the
shift register is full, and the data is transferred to the serial data reg-
ister. At the same time, the SP bit in the ICR data register is set. If more
CNT pulses are received, the data continues to shift into the shift register
until it is full again. If the user has read the serial data register before
this, the data is transferred across again, else it is lost. When using this
register for input, respond to it reasonably promptly!
In output mode, timer A is used to determine the send frequency. The
timeout rate of timer A (which must be operated in CONTINUOUS mode) controls
the baud rate of the transfer. The data are shifted out of the shift register
at half the timeout rate of timer A, whereby the maximum output rate is 1/4
of the clock frequency of the 8520.
The transfer begins after the first byte is written to the serial
data register. The CIA transfers the byte into the shift register. The
individual data bits now appear at half the timeout rate of timer A on the
SP line and the clock signal from timer A appears on the CNT line (it changes
value on each timeout so that the next bit appears on the SP line on each
negative transition [high to low] ). The transfer begins with the most sig-
nificant bit of the data byte. Once all 8 bits have been output, the CNT line
remains high and the SP line retains the value of the last bit sent. In add-
ition, the SP bit of the ICR data register is set to show that the serial
data register can be supplied with new data. If the next data byte was loaded
into the data register before the output of the last bit, the data output
will continue without interruption.
To keep the transfer continuous, the serial data register must be
re-supplied with fresh data at the proper time.
The SP and CNT lines are described as 'open-collector' outputs. This
allows the outputs of multiple CIAs to be connected together. See any good
text on electronic interfacing techniques for an explanation (especially the
mass of material on the IEEE-488 bus) of open-collector outputs.
On the Amiga, the SP line of CIA-A is connected to the KDAT line of
the keyboard 6500/1 processor, and the CNT line connected to the KCLK line of
the keyboard processor. So hardware keyboard reads can be performed by enab-
ling serial port interrupts in CIA-A (which AmigaDos does for its own use),
and writing an autovector interrupt routine to intercept the ICR SP signal
and read the serial data register.
The SP of CIA-B is connected to the Centronics BUSY signal (as is
PA0 of the CIA parallel port A) and the CNT line to the Centronics PAPER
OUT signal (as is PA1 of the CIA parallel port A).
Other Information:other connections for CIA-A are:
PC /DRDY - Centronics handshake, data ready
FLAG /ACK - Centronics handshake, data acknowledge
IRQ /INT2 input from Paula (#2)
RES /RES reset line
and connections for CIA-B are:
PC not used
FLAG /INDEX - index signal from disc drive
IRQ /INT6 input from Paula (#2)
RES /RES reset line
One final point. For those with access to FAST RAM, it is possible to examine
the Kickstart ROM on the Amiga 500. Simply set the Memory Overlay Bit (OVL,
bit 0, CIAAPRA) from a program in FAST RAM, and the CHIP RAM memory address
range becomes mapped onto the underlying Kickstart ROM. This can be read at
will. DO NOT TRY THIS FROM A PROGRAM WITHIN CHIP RAM - THE PROGRAM WILL BE
LEFT HANGING AS THE 68000 PC REFERENCES THAT PART OF THE KICKSTART ROM WHICH
HAS THE SAME ADDRESS AS THE NEXT INSTRUCTION IN YOUR OVL SWITCHING CODE! I
HEAR THE GURU KNOCKING ALREADY...
Hardware:DMA
DMA (Direct Memory Access) is the technique used by most of the Amiga custom
chips to perform their functions. The system is organised quite neatly, in
that there are two 'halves' to the processor busses. One half is connected
to all of those components accessible solely to the 68000 (such as FAST RAM
if it is present) and the CIAs. The other half contains the CHIP RAM and the
custom chips. The two halves are separated by a buffer, which disconnects
the custom chip half from the processor whenever the processor makes an acc-
ess to the FAST RAM or the CIAs. The custom chips then have total access to
the CHIP RAM.
If the processor accesses the CHIP RAM or the custom chip registers,
then the buffer re-establishes the connection. In this case, there exists a
risk of bus contention, where two bus controllers try to take over the busses
simultaneously with obviously disastrous results. Bus accesses are nested, so
that this problem is largely avoided. Also, the processor can wait until the
bus is free if the blitter has absolute priority (this can be set under the
control of software). Also, there exist odd and even bus cycles, and DMA acc-
ess is restricted to the odd bus cycles, the even bus cycles being granted
under normal circumstances to the processor.
Right, we've got that out of the way. Now for the details. The DMA
system on the Amiga consists of DMA channels, each channel assigned to one
of the hardware functions. The full list of DMA channels is:
Bitplane DMA (6 channels): These channels are used by that part
of the hardware which takes the bit
plane data & converts it for output to
the screen. If you select fewer bit
planes, then some bitplane DMA chan-
nels remain unused.
Sprite DMA (8 channels) : These channels are used by the sprite
processor. Once a given sprite DMA
channel has been used for generating a
sprite, it can be re-used for another
(with some restrictions). See later.
Disc DMA (1 channel) : Data transfer from disc to RAM or
vice versa.
Audio DMA (4 channels) : These channels are used for processing
of audio data in RAM & passing them to
the sound chip. Incidentally, for any
one interested in music, conversion of
Amiga sound data to Fairlight synthe-
sizer data and vice versa is possible!
Paula's sound system possesses a limi-
ted Fairlight compatibility.
Copper DMA (1 channel) : This channel is used for data transfer
of command words to the Copper. If the
Copperlist tells the Copper to change
the values of registers bound to other
DMA channels, those channels are used
for those purposes selected within the
Copperlist-the Copper DMA channel is
solely for Copperlist transfer to the
Copper.
Blitter DMA (4 channels): These channels are used for data trans-
fer to and from the blitter.
Now, to make life interesting in some ways (and simpler in others) the Amiga
designers related the DMA channel priorities and usages to the construction
of the video picture, and the timings in bus cycles are related to the video
picture mainly in order to make construction of the fabulous Amiga graphics
displays easier.
What is a bus cycle? Simply put, a time span of 280 nanoseconds. This
is the time taken for a single memory access across the bus by a device using
a DMA channel. Sad to say, the 68000 cannot match this speed and needs 560
nanoseconds per memory access (two DMA bus cycles). The system is designed so
that during these two bus cycles, access to the bus is split between the DMA
channels and the 68000 as mentioned above, into odd and even cycles. Note: I
haven't distinguished between read and write memory accesses for one simple
reason, namely that both take the same time within this system.
Now, if we number the cycles from zero upwards, cycle zero is given
over initially to the processor. If the processor wants access to the bus, it
gets it initially during cycle zero. Once cycle zero has finished, cycle one
begins, which is reserved (as are all odd cycles) for the DMA controller. The
DMA controller gets access to the bus during cycle one. Once cycle one has
elapsed, cycle two begins and so on, access to the bus alternating between
the 68000 and the DMA controller. This assumes that the 68000 wants to access
the bus continuously during these cycles. If the 68000 is performing internal
processing upon register data, or accessing true FAST RAM only, then that
half of the bus connected to the DMA controller is available to the DMA con-
troller during the even cycles as well, and should this be the case, the
buffer isolates the CHIP RAM section from the 68000, and the DMA controller
can access CHIP RAM during even bus cycles as well. So, in an ideal world,
with a 10 meg FAST RAM expansion, your Amiga runs like the proverbial bat out
of hell. Should the 68000 want to access CHIP RAM, however, then the fun be-
gins, because the DMA controller is stuck with the odd cycles, except under
certain conditions.
Generally, the audio, disc and sprite DMA only use up the odd bus
cycles. Thus audio, disc and sprite accesses do not slow up the processor. If
large amounts of bitplane activity are required, or the blitter is activated,
then some of the processor's even cycles are 'stolen' for these purposes. So
when this happens, the 68000 runs more slowly.
Ok, I said that bus cycles are related to the video picture. Well,
they are. Now I'll explain how. A raster line on the screen takes 63.5 micro-
seconds to produce. This is equal to 227.5 bus cycles per raster line. Each
of the bus cycles that occur during this time is allocated for some purpose
or other. The odd cycles are generally reserved for disc, audio and sprite
DMA first, then bitplane DMA. The Copper and the blitter both use even bus
cycles (tut, tut, that was naughty, Commodore!) and thus chew into the time
available for the processor.
Anyone possessing a copy of the Amiga System Programmer's Guide will
no doubt have seen the little chart of DMA bus cycles for one raster line. If
so, it will be noticed that the little boxes representing DMA bus cycles do
not add up according to the values given along the top of each chart section.
So my attempt to list in full how the DMA bus cycles are allocated here has
run into trouble. Should anyone come up with a neat system of explaining this
that doesn't require inclusion of an IFF picture with the DOC file as an aid
please get in touch.
Now for the start of the really useful information. The DMA control
register, called DMACON, consists of two parts. There is DMACON (at offset
$96) and DMACONR (at offset $02). DMACON is a write-only register, and
DMACONR is a read-only register. Generally, if the register name ends in the
letter 'R', it's a read-only register (nice sensible convention this). Both
registers are word sized, and each bit is allocated as follows (#3,#4):
Bit Function
--- --------
15 SETIT (set or clear bits)
14 BBUSY (blitter busy-read only)
13 BZERO (blitter zero-read only)
12 Not used
11 Not used
10 BLTPRI (Blitter has absolute priority
over the 68000)
9 DMAEN (Master DMA enable)
8 BPLEN (Bitplane DMA enable)
7 COPEN (Copper DMA enable)
6 BLTEN (Blitter DMA enable)
5 SPREN (Sprite DMA enable)
4 DSKEN (Disc DMA enable)
3 AUD3EN (Enable Audio channel 3)
2 AUD2EN (Enable Audio channel 2)
1 AUD1EN (Enable Audio channel 1)
0 AUD0EN (Enable Audio channel 0)
For bits 10 down to 0, if the bit is set, the corresponding function is
enabled, else it is disabled. So to enable the blitter DMA, set bit 6 of the
DMACON register.
The DMACON register takes command words formed in a way that looks
slightly odd until you understand the rationale behind it. The aim is to
avoid the need to read the current status, make a mask, exclusive-OR in your
choice of bits to change and write back the result, merely to ensure that
ONLY your choice of bits is changed. Instead, bit 15 decides whether you want
to set or clear the appropriate bits (hence the name, SETIT), and each of the
remaining bits signals whether it is one of the bits of your choice (set if
this is so, clear if not). This has the welcome effect of ensuring that DMA
channels not under consideration are left alone during your write to DMACON.
Example:to enable the bitplane DMA, bit 8 must be set. The command
word thus has bit 8 set (select the bitplane DMA bit), bit 15 SET (to signal
that bit 8 is to be set) and all other bits zero (leave the other DMA chan-
nels unaffected). If I wished to disable the bitplane DMA, the command word
I would write would have bit 8 set, bit 15 CLEAR, all others clear.
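The two command words from this example work out as $8100 and $0100. Here
they are as a minimal sketch; SETIT and BPLEN are equate names I've made up
from the bit names in the table, and CUSTOM is simply the $DFF000 base:

```asm
CUSTOM    EQU    $DFF000          ;custom chip base address
DMACON    EQU    $096             ;DMA control register (write-only)
SETIT     EQU    $8000            ;bit 15: 1=set chosen bits, 0=clear them
BPLEN     EQU    $0100            ;bit 8: bitplane DMA

          LEA    CUSTOM,A5
          MOVE.W #SETIT+BPLEN,DMACON(A5)  ;$8100: set bit 8, others untouched
          MOVE.W #BPLEN,DMACON(A5)        ;$0100: clear bit 8, others untouched
```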
Cross-reference #3 : the CIA chip documentation above refers to the
set/clear mechanism in connection with its control registers. The above para-
graphs give a complete explanation of this mechanism for anyone who has used
the cross-referencing scheme to find out more. This mechanism also is used by
the INTENA register (#4) and ADKCON (#5).
As an extra safety feature, DMA channels only become truly enabled
when the master DMA enable bit (bit 9) is set. So it is possible to select
DMA channels one at a time, then suddenly enable the lot in one go by enab-
ling the master DMA enable bit. Clearing the master DMA enable bit disables
ALL DMA channels. Of course, multiple channels can be enabled, e.g.,
MOVE.W #$8380,DMACON(A5)
enables the master DMA control, the bitplane DMA, and the Copper DMA in one
go (see the Preamble for notes on my choice of addressing mode and why).
Note that the current DMA enable status can be obtained via the
instruction
MOVE.W DMACONR(A5),D0
or something similar. A set bit indicates a DMA channel enabled.
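One likely use of DMACONR is polling the read-only BBUSY bit (bit 14) until
the blitter has finished. The routine below is a hedged sketch: the label
WaitBlit and the equate are my own, and A5 = $DFF000 is assumed. Because BTST
on a memory operand works on a byte, bit 14 of the word is tested as bit 6 of
its high byte.

```asm
DMACONR EQU     $002                    ;read-only DMA status (assumed equate)

WaitBlit:
        BTST    #6,DMACONR(A5)          ;bit 6 of the high byte = BBUSY (bit 14)
        BNE.S   WaitBlit                ;still set, so the blitter is busy
        RTS
```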
Incidentally, should you want to kill off AmigaDos and Exec totally
(as games writers like to) then kill off all DMA using the instruction
MOVE.W #$7FFF,DMACON(A5)
in your code. This alone won't do it-you'll need to kill off the interrupts
as well-but it goes a long way toward doing precisely that. The full method
is :
1) Kill off all DMA as above;
2) Kill off all interrupts (see below for how to do it);
3) Point all 68000 interrupt vectors to your own custom interrupt
handlers, or RTE if you don't want to handle a given IPLx int-
errupt level;
4) Get into supervisor mode & change the 68000 IPLx level to make
the 68000 acknowledge only those interrupts you want it to (the
4703 actually generates them-see below for how to make the 4703
generate only those interrupts that you want) using MOVE SR
(VERY NAUGHTY! SPANKING TIME FOR ALL YOU FETISHISTS OUT THERE!);
5) Set up your own DMA system but DON'T ENABLE YET;
6) Enable 4703 interrupts (again see below);
7) NOW ENABLE YOUR DMA!
At this point Exec etc., is dead and beyond resurrection other than
via a hard reset. If that's what you want, so be it.
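The seven steps above can be sketched roughly as follows. Treat this as an
outline under stated assumptions, not a finished takeover routine: it assumes
the code is ALREADY running in supervisor mode, it installs only a level 3
handler (a real program would point every autovector somewhere sensible), and
the labels TakeOver, SetupDisplay and MyLevel3 are hypothetical placeholders.

```asm
DMACON  EQU     $096                    ;assumed equates (offsets from $DFF000)
INTENA  EQU     $09A
INTREQ  EQU     $09C
LVL3VEC EQU     $6C                     ;68000 level 3 autovector address

TakeOver:
        LEA     $DFF000,A5
        MOVE.W  #$7FFF,DMACON(A5)       ;1) all DMA off
        MOVE.W  #$7FFF,INTENA(A5)       ;2) all interrupt sources disabled
        MOVE.W  #$7FFF,INTREQ(A5)       ;   and any pending requests cleared
        MOVE.L  #MyLevel3,LVL3VEC.W     ;3) own handler for level 3
        MOVE.W  #$2200,SR               ;4) supervisor, ignore levels 1 and 2
        BSR     SetupDisplay            ;5) load Copper/bitplane registers
        MOVE.W  #$C020,INTENA(A5)       ;6) SETIT + INTEN + VERTB
        MOVE.W  #$8380,DMACON(A5)       ;7) SETIT + DMAEN + BPLEN + COPEN
        RTS

SetupDisplay:
        RTS                             ;placeholder: nothing set up here

MyLevel3:
        MOVE.W  #$0020,$DFF09C          ;acknowledge VERTB in INTREQ
        RTE
```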
Hardware:Interrupts
Ok, I've already mentioned a little about interrupts above in the DMA section
but not the whole story. Now is the time to correct the omissions.
I've already mentioned the existence of the 4703 interrupt control-
ler, which actually generates the interrupts. All that the 68000 does within
the Amiga is respond to these externally generated interrupts if the value of
the IPLx bits allows it to. The standard value of SR on the Amiga is $0000 in
user mode, $2000 in supervisor mode, so all interrupts from level 1 onwards
are responded to. There's a hell of a lot of interrupts on this computer!
The 4703 interrupt controller is programmed via two sets of custom
chip registers. Again, each set has a read-only and a write-only component.
These registers are:
INTREQ (offset $09C) : write only
INTREQR (offset $01E) : read only
INTENA (offset $09A) : write only
INTENAR (offset $01C) : read only
INTENA is the interrupt enable register, and INTREQ is the interrupt request
register. Again, the read-only ones end in 'R'. The structure of all four of
these registers is identical-they are all split into individual bits, which
are assigned as follows (the numbers in square brackets in the right-hand
column correspond to the 68000 IPLx priority of the said source; note that
the NMI interrupt of priority 7 is never used):
Bit     Function
---     --------
15      SETIT   (just like the DMACON register above)
14      INTEN   (master interrupt enable)               [6] *
13      EXTER   (int. from CIA-B or expansion port)     [6]
12      DSKSYN  (disc sync value recognised)            [5]
11      RBF     (serial receive buffer full)            [5]
10      AUD3    (output audio data channel 3)           [4]
9       AUD2    (output audio data channel 2)           [4]
8       AUD1    (output audio data channel 1)           [4]
7       AUD0    (output audio data channel 0)           [4]
6       BLIT    (blitter ready)                         [3]
5       VERTB   (vertical blank interrupt)              [3]
4       COPER   (Copper interrupt)                      [3] *
3       PORTS   (CIA-A or expansion port)               [2]
2       SOFT    (reserved for software interrupts)      [2] *
1       DSKBLK  (Disc DMA transfer done)                [1]
0       TBE     (serial transmit buffer empty)          [1]
Once again, if the corresponding bit from bit 13 down is set in the INTENA
register, that interrupt source is enabled. If one of those bits is set by
an interrupt source in the INTREQ register, the corresponding interrupt is
generated & sent to the 68000. Setting or clearing the bits in INTENA/INTREQ
is done in an identical fashion to the DMACON register above. Again, the
state of the master interrupt enable bit (bit 14) determines whether the 4703
can generate any interrupts at all. If bit 14 is clear, NO interrupts will be
generated by the 4703. Needless to say, having decided which interrupts to
enable, one can enable them singly or in one go by choosing the appropriate
value to write to INTENA. So, if the INTEN bit is set, and the given bit for
the interrupt source is set, that interrupt CAN be generated, even if there
is no guarantee that it WILL be generated.
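As a small illustration of the above, here is how the vertical blank source
might be enabled and later disabled. It is a sketch only, assuming A5 holds
$DFF000; the equate for INTENA is repeated so the fragment stands alone.

```asm
INTENA  EQU     $09A                    ;write-only interrupt enable

        MOVE.W  #$C020,INTENA(A5)       ;SETIT + INTEN + VERTB: enable both
        MOVE.W  #$0020,INTENA(A5)       ;SETIT clear + VERTB: VERTB off only
```

The second write leaves INTEN and every other source exactly as they were.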
Ok, we can decide which interrupts to respond to. How does the 4703
generate them? Simple. Any interrupt source setting a bit in the INTREQ reg-
ister causes the 4703 to generate the appropriate interrupt, at which point
the 68000 gets to know about it. This can either happen in hardware (e.g., the
blitter finishes its job & posts its interrupt signal) or can be performed in
software, e.g., by setting the SOFT bit in your code directly-this will gene-
rate a software interrupt signal as used by Exec for its softints, but unless
Exec is alive and well, DON'T expect Exec to handle it like a softint! There
is a special case, the Copper interrupt. The Copper can be made to set its
own reserved Copper interrupt bit directly within a Copperlist (see later),
as a means of forcing a Copper interrupt other than the vertical blank which
is handled in hardware, and so is totally under your control. Those bits of
INTREQ capable of being set in software by your programs at will are marked
in the above list with a '*'. The others CAN be set, but care must be exer-
cised if you are to do this in your code (Usually done if you are writing a
set of interrupt handlers to handle the functions on a continuous basis, and
HAVE to set the appropriate INTREQ bit to start the sequence off).
So, the 68000 hears about an interrupt if:
1) INTEN (master interrupt enable) is set in INTENA;
2) The corresponding interrupt source bit is set in INTENA;
3) The corresponding interrupt source bit is also set in
INTREQ.
All three conditions need to be fulfilled, else the 68000 will think that no
interrupts are being generated by the interrupt source concerned. Further in-
formation is available in the file 'typed_interrupts.doc' in this series.
Cross-reference #4:for a full explanation of the SETIT bit, see the
cross-reference #3 above.
What do we do now? Well, you need an interrupt handler terminated by
an RTE & the appropriate 68000 interrupt vector changed to point to it. If
that interrupt is enabled, and its appropriate bit in INTREQ is set, then the
handler will be called. Your handler should do the following:
1) Read INTREQR to find out which interrupt request was made. Some
of the 68000 interrupt handlers will have to handle more than
one interrupt source, and need to know which caused the inter-
rupt exception.
2) If the bit corresponding to the interrupt source that you are
interested in is set in INTREQR, then CLEAR it in INTREQ to
signal that you've acknowledged it. The 4703 can now handle
another interrupt of the same type. Incidentally, things will
get interesting if the interrupt source posts its interrupts
faster than you can respond to them!
3) Process your interrupt as you wish from this point on.
This, of course, is over and above the usual things that interrupt handlers
are supposed to do, such as save scratch registers on the stack & other junk
that you should already be aware of. And for crying out loud, please use the
MOVEM instruction to do it! Note that unlike the CIAs above, the INTREQ bits
are NOT cleared when INTREQR is read! You have to clear them the hard way! I
also would like to point out that setting the SETIT bit of INTREQ will cause
a level 6 autovector interrupt to occur, so if you want, you can use this to
create your own level 6 soft interrupts-nothing else (other than the Copper
should you want it to) will set this bit but your code. Gee, isn't that nice?
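The three handler steps might look like this for the level 3 vertical blank.
This is a sketch under assumptions, not the one true handler: it assumes that
VERTB is the only level 3 source enabled in INTENA (otherwise an unhandled
BLIT or COPER request would call the handler again at once), and the labels
Level3Handler and DoVBlankWork are hypothetical.

```asm
INTREQ  EQU     $09C                    ;write-only interrupt request
INTREQR EQU     $01E                    ;read-only copy of the same

Level3Handler:
        MOVEM.L D0/A5,-(SP)             ;save the registers used here
        LEA     $DFF000,A5
        MOVE.W  INTREQR(A5),D0          ;1) which source asked?
        BTST    #5,D0                   ;   VERTB is bit 5
        BEQ.S   .done                   ;   not ours: nothing to do here
        MOVE.W  #$0020,INTREQ(A5)       ;2) SETIT clear + bit 5: acknowledge
        BSR     DoVBlankWork            ;3) process the interrupt
.done:
        MOVEM.L (SP)+,D0/A5             ;restore and leave
        RTE

DoVBlankWork:
        RTS                             ;placeholder for your own code
```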
A word of warning:on my machine at least (this may vary), something
quaint happens if one tries performing the instruction
CLR.W INTREQ(A5)
or similar, due to the read/write access of the 68000 CLR instruction (see
the Preamble). The machine seems to 'hiccup' before carrying on as normal. I
would not suggest doing this too often, it might blow something expensive to
mend. In any case, the above instruction does bugger all to the status of the
interrupt system for reasons made obvious upon analysis (see #3 above), so I
don't think that there's much point in doing it-I only did it by accident. I
cannot stress too much, however, that the Preamble caveat about CLR should be
adhered to (I only found out after extensive use of CLR in my code & had to
rip it all apart again to make it work properly. Some unfortunates have some
copies of my old code instead of the CLR-free versions & wonder what the hell
is going on when they run it...).
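For comparison, a write that actually does clear every pending request
follows from the set/clear mechanism of #3: leave bit 15 clear and select all
the other bits, using MOVE rather than CLR. A one-line sketch, assuming
A5 = $DFF000:

```asm
INTREQ  EQU     $09C                    ;write-only interrupt request

        MOVE.W  #$7FFF,INTREQ(A5)       ;SETIT clear, bits 14-0 selected
```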
Hardware:The Copper
This is one of the harder chapters to write, and for a very good reason. The
Copper is a very powerful little piece of hardware, but with that power comes
complexity.
The Copper is a coprocessor capable of writing values to the custom
chip registers independently of the 68000, and of performing actions based on
the position of the video beam. All in all, a highly useful little fellow. As
befits what is a processor in its own right, it has its own machine language
and it is programs written in this special Copper machine language that are
the famous Copperlists of Amiga parlance. At this point, those thinking to
themselves 'oh, no, not another machine language to learn!' should be reas-
sured by the knowledge that there are only three basic instruction types in
Copper machine language. These three basic instruction types are versatile,
however, and much can be done with them.
Sad to say, Copperlists have to be created by hand as far as I am
aware, at least if you want to take advantage of ALL Copperlist features.
I have been told that the Argasm assembler can take Copperlists in the form
of Copper assembly language & assemble them, but I've yet to check this. And
yes, there is a Copper assembly language to make life a little easier when
creating Copperlists.
So, to the Copper Assembly Language (CAL for short). The three basic
CAL instructions are:
MOVE : Write an immediate data value into a custom chip
register (like the 68000 MOVE #nnnn,xxxx);
WAIT : Wait until the electron beam generating the video
picture has reached a certain position;
SKIP : If the electron beam has reached the specified
position, skip the next CAL instruction, else
execute sequentially as normal.
Doesn't seem much, does it? Well, you can do a hell of a lot with this to
hand. I shall deal with the instructions in turn.
The MOVE instruction:this instruction allows the Copper to write an
immediate value into a custom-chip register. The register is specified in the
instruction as an offset from the base address $DFF000 (now see why I prefer
using offset(An) in 68000 for addressing custom chip registers. The Copper
does it in hardware! Makes for consistency in programming) and the value to
write is always a WORD value. The CAL syntax is
MOVE #value,register
for anyone blessed with a CAL assembler (or Argasm if it does this for you).
The actual way that the instruction is coded as machine words in memory is:
%0000000rrrrrrrr0,$XXXX
Here, rrrrrrrr represents the register address. Since all register offsets
from $DFF000 are even, bit 0 of the first word of a MOVE instruction is al-
ways zero. The second word contains the 16-bit value to write to the chosen
custom-chip register. For example, to write the value corresponding to the
colour light green into palette register 3, one would use the CAL syntax
MOVE #$03C3,COLOR03
which would become (since COLOR03 is at offset $186) the machine words
$0186,$03C3
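Without a CAL assembler, the same thing can be typed straight into 68000
source as data words. A short sketch, assuming the usual palette layout in
which COLOR00 sits at offset $180 and COLOR01 at $182; the label is my own:

```asm
Palette:
        DC.W    $0180,$0000             ;MOVE #$0000,COLOR00 (black)
        DC.W    $0182,$0FFF             ;MOVE #$0FFF,COLOR01 (white)
        DC.W    $0186,$03C3             ;MOVE #$03C3,COLOR03 (as above)
```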
This seems simple enough. Now, the fun starts. There is a restriction upon
the registers that can be written to by the Copper. Under normal circumstan-
ces, the Copper cannot write to registers from offsets $000 to $07E (most of
these are read-only anyway). There exists a special custom chip register, the
COPCON register, consisting of one bit (bit 1). If this bit, which is called
the Copper Danger Bit or CDANG bit, is set, then the Copper can access the
custom chip registers from offsets $040 to $07E, which just happen to be the
blitter control registers. Access to the registers from offsets $000 to $03E
is NEVER allowed. COPCON itself is at $02E (write-only) and inaccessible to
the Copper itself (the 68000 must write to this register).
So, bearing this restriction in mind, the Copper can write to most
of the custom chip registers, and if allowed to by the 68000, can write to
the blitter control registers and influence the blitter. This alone gives the
Copper considerable power within the system. In particular, the Copper can
change the DMA enable status, the interrupt enable status, the sprite control
registers, the palette, the bitplane control, the sound chip, and to a limit-
ed extent, the disc controller! Needless to say, it's only safe to let the
Copper do all this when you know how it's all done. See each section in turn
for the requisite information.
The WAIT instruction:this instruction causes the Copper to do just
that, wait. The Copper waits for a specific position to be reached by the
electron beam generating the video picture before continuing execution of
the remaining instructions in the Copperlist. This is how various tricks are
achieved such as changing background colours at specified screen positions
to create 'sunset' effects etc. The Copperlist contains a sequence of WAITs
interspersed with MOVEs to the background colour palette register, COLOR00 at
offset $180. Should the electron beam have already passed the given position,
normal sequential execution is resumed. The CAL syntax for WAIT is
WAIT (x1,y1) MASK (x2,y2) BFD
(Anyone used to the CMOVE/CWAIT macros on Devpac here forget those-they never
use the full power of the WAIT instruction. This is MY defined CAL syntax for
the WAIT instruction, that tells you everything).
In this syntax, x1 and y1 are the beam position to wait for. The MASK and BFD
entries are optional, but if they are included they have a profound effect
(more of this below). If they are omitted, the WAIT
instruction makes the Copper wait until the beam reaches position (x1,y1) and
then resume normal sequential execution.
If the MASK specifier is included, the fun starts. Instead of using
(x1,y1) directly, the comparison of the beam position is made with the values
formed by logically ANDing together x1 and x2 for the horizontal position, &
ANDing together y1 and y2 for the vertical position. Omitting the MASK speci-
fier is to be regarded as having the same effect as having a MASK value of
(-1,-1), or all 1's in binary (in which case x1 AND x2 becomes x1, etc.).
This opens up many possibilities. For example, in the instruction
WAIT (0,$0F) MASK (-1,$0F)
the WAIT condition will be fulfilled every 16 lines, i.e., whenever the lower
four bits of y1 are all 1. Note the -1 mask value for the horizontal position
(i.e., all 1's binary). Since in this example I am not interested in the hor-
izontal position at all, I could have had a horizontal mask of 0, but if the
horizontal position is important, choose the mask accordingly. The mask bits
affect BOTH the specified position in the instruction AND the actual beam
position coordinates before the position comparison is performed.
The machine words for the WAIT instruction take the form:
%vvvvvvvvhhhhhhh1,%bvvvvvvvhhhhhhh0
In the first word, the vvv bits specify the vertical beam position, and the
hhh bits the horizontal beam position. Note that bit 0 is equal to 1. This
distinguishes WAIT (and SKIP below) from MOVE. WAIT is distinguished from
SKIP by having bit 0 of the second machine word set to zero (for SKIP, this
is set to 1).
In the second word, the b bit is the Blitter Finish Disable bit, or
BFD bit. In my CAL syntax, including BFD in the instruction specification has
the meaning 'set the BFD bit in the second word'. This bit is used when the
Copper is used to start a blitter operation. The Copper in general must know
when the blitter has finished whatever blitter operation was started at some
time past, whether started by the Copper or the 68000. If the BFD bit is zero
(omitting BFD in the CAL syntax means 'clear the BFD bit') then the Copper
will WAIT until the blitter finishes, and THEN check the wait condition. If
the BFD bit is one, the blitter status is ignored. If COPCON=0, and the
Copper does not affect the blitter in any way, BFD should be set to 1.
In the second word, the vvv bits are the vertical mask, and the hhh
bits the horizontal mask. If no mask is specified, these are all set to 1.
Note that bit 7 of the vertical position cannot be masked. If a vertical mask
is specified, then bit 7 of the vertical position is always treated as though
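the mask bit for that position was 1. Note that since there are 313 lines of
display, and the vertical position is only 8 bits wide (and the vertical mask
7 bits wide), WAITs for vertical positions greater than 255 have to be
performed in two steps. First wait for the last horizontal position of line
255 (machine words $FFDF,$FFFE), so that the 8-bit vertical comparison wraps
round to zero on the next line, then wait for the low 8 bits of the line
actually wanted:
WAIT (0,y-256)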
NOTE : Horizontal positions are specified in steps of 4 low-resolution pixels
and NOT in pixel coordinates directly!
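To make the bit layout concrete, here is a hand-assembled 'sunset' fragment,
followed by the masked example from earlier. It is a sketch only: the labels
are my own, the horizontal mask is zeroed so that each WAIT fires at the start
of its line, and the CAL text in the comments follows my syntax above.

```asm
Sunset:
        DC.W    $6401,$FF00             ;WAIT (0,$64) MASK (0,-1) BFD
        DC.W    $0180,$0F00             ;MOVE #$0F00,COLOR00 (red)
        DC.W    $7401,$FF00             ;WAIT (0,$74) MASK (0,-1) BFD
        DC.W    $0180,$0F80             ;MOVE #$0F80,COLOR00 (orange)
        DC.W    $8401,$FF00             ;WAIT (0,$84) MASK (0,-1) BFD
        DC.W    $0180,$0FF0             ;MOVE #$0FF0,COLOR00 (yellow)

Masked:
        DC.W    $0F01,$0FFE             ;WAIT (0,$0F) MASK (-1,$0F), BFD clear
```

In $FF00 the top bit is BFD, the next seven bits are the vertical mask (all
1's) and the horizontal mask bits are all 0. In $0FFE the BFD bit is clear,
the vertical mask is $0F and the horizontal mask is all 1's.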
The SKIP instruction:the only difference between this instruction &
the WAIT instruction in terms of machine words is that bit 0 of the second
word is 1 instead of 0. The bits are otherwise identical in format to those
for the WAIT instruction.
The SKIP instruction allows conditional branches to be set up within
a Copperlist. The mechanism is slightly quirky, however. Basically, the beam
position is compared with the (x1,y1) arguments as for the WAIT instruction,
with MASK data also applying identically. If the beam position is greater
than or equal to the (x1,y1) argument (MASK notwithstanding) then the Copper
skips the next Copperlist instruction, and moves on directly to the instruc-
tion following it. Otherwise, instruction execution continues in the normal
sequential fashion. Full information on conditional branches using this in-
struction is given below. As may be expected, my CAL syntax for the SKIP in-
struction is
SKIP (x1,y1) MASK (x2,y2) BFD
just as for the WAIT instruction. Comments regarding the optional MASK and
BFD arguments applying to WAIT apply identically to SKIP.
Now, we have the three Copper instructions. A Copperlist is simply
a sequential list of these instructions, in machine word format, in memory.
As may be expected, the Copperlist MUST be in CHIP RAM.
Having read this far, one may wonder about how a Copperlist is ter-
minated. A trick is used here. The final instruction in a Copperlist is a
WAIT instruction for an impossible beam position, such as WAIT ($FE,$FF). This
condition will never be fulfilled because a horizontal beam position greater
than $E4 isn't possible. I personally use the machine words
DC.W $FFFF,$FFFE
as the end of my Copperlist. When the beam reaches the end of the video pic-
ture, the Copper is automatically restarted at the start of the Copperlist
(unless you arrange otherwise!).
Ok. We have a Copperlist, complete with the impossible WAIT at the
end. How do we tell the Copper to execute our Copperlist? The Copper has a
set of registers for this. These are:
COP1LCH offset $080
COP1LCL offset $082
COP2LCH offset $084
COP2LCL offset $086
COPJMP1 offset $088
COPJMP2 offset $08A
The two register pairs, COP1LCH/L and COP2LCH/L, are loaded with the 18-bit
CHIP RAM address of the start of your Copperlist (or some other address-see
later!). For simple Copperlists, COP1LCH/L alone is used. Having done this,
and turned on the Copper DMA, writing any value to COPJMP1 starts the Copper
executing your Copperlist. When the Copper reaches the impossible WAIT ins-
truction at the end of your Copperlist, it will WAIT until the vertical blank
occurs, at which point the Copper will restart at the address loaded into the
COP1LCH/L pair. More correctly, COPJMP1 causes the data in the COP1LCH/L pair
to be transferred to the Copper's internal program counter, and execution in-
itiated. If you have an address in COP2LCH/L, writing to COPJMP2 will cause
that value to be loaded instead. For simple Copperlists (no interlace or any
conditional branches) the COP1LCH/L and COPJMP1 set are the defaults used, &
these will be used by the Copper for restarting the Copperlist execution at
the vertical blank interval. NOTE:the Copper needs its 'programs' aligned on
a word boundary just like the 68000. In fact, word alignment holds for all
custom chip operations - unless I discover any exceptions and document them
in this file, treat word alignment as mandatory.
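The start-up sequence just described might be coded as below. This is a
sketch under assumptions: A5 = $DFF000, the labels StartCopper and CopperList
are hypothetical, and it is up to you to make sure the list really is placed
in CHIP RAM.

```asm
COP1LCH EQU     $080                    ;equates repeated so this stands alone
COPJMP1 EQU     $088
DMACON  EQU     $096

StartCopper:
        MOVE.L  #CopperList,COP1LCH(A5) ;long write fills COP1LCH and COP1LCL
        MOVE.W  #$8280,DMACON(A5)       ;SETIT + DMAEN + COPEN
        MOVE.W  #0,COPJMP1(A5)          ;any write will do (MOVE, not CLR)
        RTS

CopperList:                             ;must live in CHIP RAM
        DC.W    $0180,$0000             ;MOVE #$0000,COLOR00
        DC.W    $FFFF,$FFFE             ;impossible WAIT: end of list
```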
Note that the COPxLCH/L register pairs are NOT program counters! The
Copper's program counter is NOT directly accessible, and the values stored in
the COPxLCH/L register pairs are INITIAL program counter values, needing to
be loaded only once under normal circumstances. The values in COPxLCH/L never
change once loaded, unless 1) they are changed by the 68000 at some future
time (under your control!); 2) the Copperlist contains instructions such as
MOVE #value_hi,COP1LCH ;high word of 18-bit value
MOVE #value_lo,COP1LCL ;low word of same
which causes the Copper to alter them itself.
Conditional Branches:now it may already have become obvious to the
astute reader how to form a conditional branch in CAL. For those who haven't
yet worked out all of the details, here they are.
First, one needs to know the absolute address in memory of the point
to which to branch. BEFORE the branch is to be executed, reload COP1LCH/L
with this new address-the Copper can be made to do it using instructions such
as the example above. Then immediately after a SKIP instruction, place the
Copper instruction
MOVE #0,COPJMP1
and the rest of the Copperlist following this. If the beam position when the
Copper reaches the SKIP instruction is greater than or equal to the SKIP
instruction's position argument, the MOVE above will be skipped, and the
remaining instructions of the Copperlist following the MOVE executed. If the
position is less than the argument, the MOVE above will be executed, and the
Copper will
load its internal program counter with the new value of COP1LCH/L that you
forced it to earlier on. Hey presto! Conditional branch! Of course, if you
wish to leave COP1LCH/L alone, you can use COP2LCH/L instead, and use a MOVE
to COPJMP2 to cause the branch instead.
WARNING:If you change COP1LCH/L in order to force a conditional br-
anch using the above mechanism, REMEMBER TO RESET IT TO POINT TO THE START OF
YOUR COPPERLIST AGAIN AFTER THE BRANCH! You have to do this in two places in
your Copperlist, once after the MOVE to COPJMP1 (to take account of branch
not performed) and once immediately after the branch point (to take account
of branch performed). Failure to do this will result in the Copper failing
to execute portions of your Copperlist after the first execution!
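A conditional branch can be sketched with the COP2LCH/L variant, which leaves
COP1LCH/L alone and so sidesteps the warning above. Everything here is
illustrative: the labels are hypothetical, A5 = $DFF000 is assumed, and the
68000 loads the branch address once before the Copper is started.

```asm
COP2LCH EQU     $084

;68000 side
SetBranchTarget:
        MOVE.L  #Target,COP2LCH(A5)     ;COP2LCH/L := branch address
        RTS

;Copper side (CHIP RAM)
CopperList:
        DC.W    $6401,$FFFF             ;SKIP (0,$64) BFD, no mask given
        DC.W    $008A,$0000             ;MOVE #0,COPJMP2: only executed if
                                        ;the beam has NOT reached line $64
        DC.W    $0180,$0F00             ;fall-through path: COLOR00 = red
        DC.W    $FFFF,$FFFE
Target:
        DC.W    $0180,$000F             ;branch path: COLOR00 = blue
        DC.W    $FFFF,$FFFE
```

The second word of the SKIP is $FFFF because BFD is set, all mask bits are 1
and bit 0 is 1 (which is what marks it as SKIP rather than WAIT).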
Interlaced playfields:you need two Copper lists for this, one for
the long frame and one for the short frame. The long frame Copperlist (the
first one) should initialise bitplane pointers to point to the FIRST line of
the bitplanes, and the short frame Copperlist should initialise the bitplane
pointers to point to the SECOND line of the bitplanes. At the end of the long
frame Copperlist, before the impossible WAIT, insert two instructions to set
the COP1LCH/L pair to point to the short frame Copperlist. Similarly, at the
end of the short frame Copperlist, place instructions to point COP1LCH/L at
the start of the long frame Copperlist. The Copper will then alternate back
and forth between the two Copperlists. In addition, the bitplane control (see
below) needs to have the LACE bit set, and various other instructions need to
be executed to ensure proper system synchronisation, and ensure that your in-
terlaced playfields are displayed properly. More details in the section below
on bitplane control.
Note:incorrect setting of the Copper registers can lead to the so-
called FIREWORKS_MODE of the Amiga occurring. This occurs when the Copper is
pointed to an invalid area of memory, and the Copper tries to execute what it
thinks is a Copperlist there. The Copper isn't intelligent (like some COBOL
programmers I know) and thinks that anywhere it's pointed to is a Copperlist
to execute, and happily runs it. This usually has weird and wonderful effects
such as screwing up the screen completely. This phenomenon (the runaway Cop-
per syndrome) is the ONE event that can crash the Amiga completely, beyond
even a Guru recovery. It's sometimes pretty to watch, but can result in your
Amiga containing a fried Agnus or something else fatal if you don't switch
off the moment it happens. YOU HAVE BEEN WARNED.
Copper interrupt:In the section on interrupt control, it was stated
that the Copper has its own interrupt control bit in INTENA/INTREQ. To signal
a Copper interrupt, place the instruction
MOVE #$8010,INTREQ
into your Copperlist at the desired point, and the Copper will force the int-
errupt system to generate the Copper interrupt. Any interrupt bit can be set
this way, but the above instruction sets bit 4 of INTREQ, specially provided
for the Copper. By placing this instruction after suitable WAIT instructions
one can tell the 68000 that a given screen position has been reached, and is
the recommended method of setting up Raster Interrupts. Amiga Raster Inter-
rupts are completely controllable, and can be made to occur not only at a
given raster line position, but at a given screen column as well! Using the
MASK option in WAIT allows all manner of wonderful tricks to be performed.
The only limit is your imagination at this point.
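A raster interrupt at line $80 could be set up along these lines. This is a
sketch, with hypothetical labels and A5 = $DFF000 assumed; a level 3 handler
that acknowledges bit 4 of INTREQ must already be installed.

```asm
INTENA  EQU     $09A

EnableCopperInt:
        MOVE.W  #$C010,INTENA(A5)       ;SETIT + INTEN + COPER
        RTS

;Copperlist fragment (CHIP RAM)
RasterInt:
        DC.W    $8001,$FF00             ;WAIT (0,$80) MASK (0,-1) BFD
        DC.W    $009C,$8010             ;MOVE #$8010,INTREQ
        DC.W    $FFFF,$FFFE             ;end of list
```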
Hardware:Bitplanes & Bitplane Control
Having discovered how to generate a Copperlist, the next logical step is to
learn how to control the bitplane usage. All bitplane control registers are
accessible to the Copper (except for those below offset $040), and thus one
can set up a Copperlist to set these registers to given values. This is of
particular value for interlaced playfields, already mentioned in the Copper
section above, but can be performed for any type of playfield if wanted.
Bitplane Control Registers:the full list of bitplane control regis-
ters is:
VPOSR     (offset $004)  Read MSB of vertical beam position
VHPOSR    (offset $006)  Read vertical/horizontal beam position
VPOSW     (offset $02A)  Write MSB of vertical beam position
VHPOSW    (offset $02C)  Write vertical/horizontal beam pos
DIWSTART  (offset $08E)  set top left corner of display window
DIWSTOP   (offset $090)  set bottom right corner of same
DDFSTART  (offset $092)  horiz. pos start of bitplane DMA fetch
DDFSTOP   (offset $094)  horiz. pos end of bitplane DMA fetch
BPL1PTH   (offset $0E0)  bitplane pointers, high/low, for
BPL1PTL   (offset $0E2)  up to 6 bitplanes as wanted
BPL2PTH   (offset $0E4)
BPL2PTL   (offset $0E6)
BPL3PTH   (offset $0E8)
BPL3PTL   (offset $0EA)
BPL4PTH   (offset $0EC)
BPL4PTL   (offset $0EE)
BPL5PTH   (offset $0F0)
BPL5PTL   (offset $0F2)
BPL6PTH   (offset $0F4)
BPL6PTL   (offset $0F6)
BPLCON0   (offset $100)  main bitplane control register
BPLCON1   (offset $102)  scroll values for outsize playfields
BPLCON2   (offset $104)  sprite/playfield & DUALPF control
BPL1MOD   (offset $108)  bitplane modulo for odd planes
BPL2MOD   (offset $10A)  bitplane modulo for even planes
The main bitplane control register BPLCON0 is organised as follows (bit set
equals function enabled, bit clear equals function disabled):
Bit     Function
---     --------
15      HIRES   Turn on high-resolution mode
14      BPU2    These three bits contain the
13      BPU1    number of bitplanes used
12      BPU0
11      HOMOD   Hold & Modify mode on
10      DBPLF   Dual Playfield mode on
9       COLOR   Video output colour (always set this!)
8       GAUD    Genlock Audio on
7-4     Unused
3       LPEN    Lightpen input active
2       LACE    Interlace mode on
1       ERSY    External synchronisation on
0       Unused
Some restrictions exist. HOMOD and DBPLF cannot both be set simultaneously,
one or the other only can be set. Both bits can be cleared, however, and if
all six bitplanes are enabled, the hardware automatically selects the EXTRA-
HALFBRITE mode. Also, GAUD and ERSY are only useful with a Genlock interface
(DO NOT SET THEM UNLESS YOU'RE USING ONE!). The legal value range for the
BPUx bits is 0 to 6, 7 is not allowed. Don't ask me why 0 is allowed...
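A few BPLCON0 values built from the table above, as a sketch. The four writes
are alternatives (pick one, since each overwrites the last), A5 = $DFF000 is
assumed, and the equate is repeated so the fragment stands alone.

```asm
BPLCON0 EQU     $100

        MOVE.W  #$5200,BPLCON0(A5)      ;BPU=5 + COLOR: 5 planes, low-res
        MOVE.W  #$C200,BPLCON0(A5)      ;HIRES + BPU=4 + COLOR
        MOVE.W  #$6A00,BPLCON0(A5)      ;BPU=6 + HOMOD + COLOR: HAM
        MOVE.W  #$5204,BPLCON0(A5)      ;as the first, with LACE added
```

Note that BPLCON0 takes plain values: there is no SETIT bit here, so the whole
word is replaced on every write.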
Having decided which screen mode is desired, one then needs to set
the bitplane sizes. The registers DIWSTART (Display Window Start) and DIWSTOP
(Display Window Stop) are used for this. Bits 15-8 contain the vertical pos-
ition, and bits 7-0 the horizontal position.
DIWSTART is assumed to rest in the top left quadrant of the screen.
This is a fairly sensible assumption, after all. Because the vertical
position can be from 0 to 312, which needs 9 bits, the missing top bit
(bit 8) is assumed to be zero, giving vertical positions from 0 to 255.
Similarly, the missing bit 8 of the horizontal position is assumed to be 0,
giving a horizontal position from 0 to 255.
DIWSTOP is a little more complicated. It is assumed to lie in the
lower right quadrant of the screen (sensible again) and hence the missing
bit 8 of the horizontal position is assumed to be 1, giving horizontal
positions from 256 to 448. Because vertical end positions both greater than
and less than 255 should be possible, a trick is used. Bit 15 of DIWSTOP
(bit 7 of the vertical position) is inverted to provide the missing bit 8,
making an end position of 128 to 312 possible. For end positions from 256 to
312, make bit 15 zero (thus making the hidden bit 8 equal to 1), and for end
positions from 128 to 255, make bit 15 one (thus making the hidden bit 8
equal to 0). Also, DIWSTOP should have the horizontal and vertical values
PLUS ONE set into it to work properly.
Typical PAL values for the screen are TLC (top left corner) coord-
inates (129,41), and BRC (bottom right corner) coordinates (448,296). This
corresponds to DIWSTART = $2981, and since DIWSTOP should contain the values
(449,297) instead of (448,296), DIWSTOP = $29C1 is used. This produces a PAL
320 x 256 display area centred in the middle of the monitor display.
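Worked through: DIWSTART takes V=41 ($29) in the high byte and H=129 ($81) in
the low byte. For DIWSTOP, V=297 is $129 and H=449 is $1C1, and only the low 8
bits of each are stored, the hidden top bits being supplied by the hardware as
described above. A sketch of the two writes, assuming A5 = $DFF000:

```asm
DIWSTART EQU    $08E
DIWSTOP  EQU    $090

        MOVE.W  #$2981,DIWSTART(A5)     ;V=$29 (41), H=$81 (129)
        MOVE.W  #$29C1,DIWSTOP(A5)      ;V=$129 -> $29, H=$1C1 -> $C1
```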
Limitations exist on these values. Firstly, monitor tube distortions
limit the values (the corners will be cut off if the entire monitor screen
area is used), and the blanking gaps need to be taken into account. Vertical
blanking gaps occupy lines 0 to 25, making the earliest TLC vertical position
26 lines from the VBL start, and the latest BRC vertical position is 312. The
horizontal situation is more complex. The horizontal blanking gap (HBL) lies
between columns 30 and 106. Horizontal positions from 107 are possible.
We have set the screen mode, and the bitplane size. Now we need to
set up the bitplane DMA. The DMA data fetch must start in synchronisation
with the start & stop values, to ensure that the pixels appear in the right
places on screen. Vertically, this is no problem. Screen DMA starts and ends
in synchronisation with the DIWSTART/DIWSTOP vertical positions automatically
and no register control of this is provided. Horizontally, this is a problem
however. To display a pixel on the screen, the current word needs to be read
from each bitplane. For 6 bitplanes, low-resolution, 8 bus cycles are needed.
In addition, the hardware needs a half bus cycle before the data can appear
on the screen. The bitplane DMA must therefore start exactly 8.5 cycles or
17 pixels before the start of the screen window for low-resolution screens,
and 4.5 cycles or 9 pixels before the start of the screen window for high-
resolution screens.
The registers controlling this data fetch are DDFSTART and DDFSTOP
(Display Data Fetch Start, and Display Data Fetch Stop). Only bits 7 to 2
are writeable, the others are "don't care" bits and should be set to 0. Bit
2, the lowest writeable bit, should always be 0 for low-resolution screens
since the bitplanes are read once every 8 bus cycles, and the values of
both DDFSTART and DDFSTOP must be an exact multiple of 8. Regardless of the
resolution, the difference between DDFSTART and DDFSTOP (the Amiga System
Programmer's Guide is littered with misprints about here!) must always be
divisible by 8, since the hardware always divides the lines into sections
of 8 bus cycles each. Even in high-res mode, the bitplane DMA is performed
for 8 bus cycles beyond DDFSTOP, so that 32 points are always read.
Let H equal the horizontal start of the screen, as set in DIWSTART.
Also, let P equal the number of pixels per line. The values for DDFSTART and
DDFSTOP are thus computed as:
Low Resolution  : DDFSTART = (H/2 - 8.5) AND $FFF8
                  DDFSTOP  = DDFSTART + P/2 - 8
High Resolution : DDFSTART = (H/2 - 4.5) AND $FFFC
                  DDFSTOP  = DDFSTART + P/4 - 8
(The mask is $FFFC in high resolution because bit 2 of DDFSTART may be set
there.)
For our standard PAL window of 320x256 centred as above, we have H=129, P=320
and the values are thus
DDFSTART : (129/2 - 8.5) AND $FFF8 = $38
DDFSTOP  : $38 + 320/2 - 8         = $D0
For a high-resolution screen of 640x256 centred as above, we now have H=129,
P=640, and the values are thus
DDFSTART : (129/2 - 4.5) AND $FFFC = $3C
DDFSTOP  : $3C + 640/4 - 8         = $D4
DDFSTART cannot be less than $18. This is because the first $18 bus cycles
are reserved for the memory refresh, disc and audio DMA, and the DMA channel
for sprite 0 (used as the mouse pointer) which cannot be turned off. DDFSTOP
is limited to a maximum of $D8 (horizontal blank occurs beyond this!).
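The data fetch sums can be checked with a short Python sketch. The function name is hypothetical, and the high-resolution mask is taken as $FFFC on the assumption that bit 2 may be set in that mode (the worked value $3C needs it). The limits of $18 and $D8 are the ones just given.

```python
def ddf_registers(h, pixels, hires=False):
    """Return (DDFSTART, DDFSTOP) for window start h and pixels per line."""
    if hires:
        # H/2 - 4.5 bus cycles, done in integers; bit 2 may be set.
        start = ((h - 9) // 2) & 0xFFFC
        stop = start + pixels // 4 - 8
    else:
        # H/2 - 8.5 bus cycles; the result must be a multiple of 8.
        start = ((h - 17) // 2) & 0xFFF8
        stop = start + pixels // 2 - 8
    if start < 0x18 or stop > 0xD8:
        raise ValueError("data fetch outside the permitted $18-$D8 range")
    return start, stop


for mode, width in ((False, 320), (True, 640)):
    start, stop = ddf_registers(129, width, hires=mode)
    print(f"DDFSTART = ${start:02X}, DDFSTOP = ${stop:02X}")
```

With H=129 this gives $38/$D0 for the 320-pixel low-resolution screen and $3C/$D4 for the 640-pixel high-resolution one.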
Bitplane Pointers:the list above of registers includes the bitplane
pointers BPLxPTH/L. Each word-sized 'L' register combines with its 'H' coun-
terpart to form a pointer into the CHIP RAM. By setting these pointers to the
address of the start of each portion of bitplane memory, and then setting the
BPLCON0 register for the appropriate screen mode, bitplane control is almost
complete.
Note that in this case, the bitplane pointer contents are CHANGED by
the operation of the system, as opposed to the COPxLCH/L registers above. As
a result, the bitplane pointers need to be reset after each use, either by an
appropriate Copperlist, or by the 68000 during the VBL. There exist six other
registers called BPLxDAT which are accessible only by the DMA system. When a
BPLxPTH/L register pair is accessed to obtain a bitplane address, it is inc-
remented by two after the requisite data word is accessed and passed on to
the BPLxDAT register. Once the full complement of BPLxDAT registers for the
given screen are loaded, their data is passed to the display electronics, and
the process repeated. This is why each of the BPLxPTH/L registers needs to be
reset at the VBL or within the Copperlist. The actual reading of the
bitplane data occurs during the interval between the occurrence of DDFSTART &
DDFSTOP. After DDFSTOP has been reached, the bitplane pointers are changed by
the values contained in the BPLxMOD registers, and under normal circumstances
these registers are set to zero. The BPLxMOD registers will be covered more
fully later on in this section.
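As a sketch of the pointer reset, the Python below builds the Copper MOVE words that reload the bitplane pointers each frame. It assumes the usual MOVE layout of a register offset word followed by a data word, and that BPL1PTH sits at offset $0E0 with the other pointer registers following in order. The function name and the example addresses are invented for illustration.

```python
BPL1PTH = 0x0E0  # offset of the first bitplane pointer, high word


def bitplane_pointer_moves(plane_addresses):
    """Return the Copper MOVE words that reload BPLxPTH/L for each plane."""
    words = []
    for n, addr in enumerate(plane_addresses):
        reg = BPL1PTH + 4 * n
        words += [reg, (addr >> 16) & 0xFFFF]   # MOVE high word into BPLxPTH
        words += [reg + 2, addr & 0xFFFF]       # MOVE low word into BPLxPTL
    return words


# Two planes of 40 x 256 bytes each, at made-up CHIP RAM addresses.
for word in bitplane_pointer_moves([0x20000, 0x22800]):
    print(f"${word:04X}")
```

Placing these words at the top of the Copperlist means the pointers are put back every frame without the 68000 having to do anything.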
So far, this information assumes that the playfields are designed to
be the same size as the area displayed. It is perfectly possible to design a
collection of outsize playfields larger than the display area, and display a
portion of each. The playfield can be extra-tall, extra-wide or both, making
screen scrolling almost ridiculously easy in comparison with other computers
such as the Atari ST.
To manage extra-tall playfields is simplicity itself. Simply alter
the values of BPLxPTH/L used as the start point for vertical scrolling. If
this cannot be done in a Copperlist, use the 68000 during the VBL. One way
of making the Copperlist handle it is to write to the Copperlist directly,
using the Copper interrupt to signal that the Copper has executed beyond the
point at which you wish to write the new addresses into the Copperlist, and
performing the write operation during the Copper interrupt. Alternatively one
can use the 68000 during the VBL interrupt, storing the true base pointers
and the scrolled values somewhere safe beforehand, and updating the scrolled
values each time a scroll is performed.
Note that I mention using the Copper interrupt to signal that it is
safe to write into the Copperlist. If this is not done, it is possible to
write into the Copperlist at the same point being accessed by the Copper, and
thus ensuring that the Copper gets the wrong data. Use of the Copper inter-
rupt ensures that the Copper has genuinely finished with the portion of the
Copperlist being rewritten.
Managing extra-wide playfields is a little more complicated, but the
hardware makes for almost unbelievable speed in pixel-boundary horizontal
scrolling. BPLCON1 (offset $102) is used to control the pixel offset from
0-15 (remember, the DMA system accesses bitplanes in 16-bit words,
corresponding to 16 pixels of bitplane data). Bits 15-8 of BPLCON1 are
unused. Bits 7-4 control the pixel offset for the even planes (playfield 2),
and bits 3-0 control the pixel offset for the odd planes (playfield 1). The
value in BPLCON1 is a delay: it sets the number of pixels by which the
bitplane data is held back, which moves the picture to the RIGHT on screen.
To move the picture X pixels to the LEFT (X from 1 to 15), one must use a
delay of 16-X and add 2 to each of the bitplane pointers.
Also, to ensure that the extra-wide playfield is properly displayed,
there exist modulo registers. Modulo registers are used extensively within
the Amiga hardware, particularly by the blitter. A modulo register contains a
value to be added on to a pointer register pair value in order to ensure that
the pointer points to the correct data word after a series of operations. An
example will illustrate.
Let us create a 640-pixel low-resolution display. This is twice as
wide as the standard display of 320 pixels. After reading the first 20 words
of the display (40 bytes), the bitplane pointers are pointing to word 21. In
a normal display, the BPLxMOD registers contain zero, and this is added on to
the BPLxPTH/L values to reference the next line. For our double-width play-
field, this is not correct. We want to skip another 20 words (40 bytes) to
reference the second line correctly. This is done by setting BPLxMOD to 40.
This is then added on to the BPLxPTH/L pairs by the system and the second
line of our double-width playfield is thus referenced correctly by the DMA
system.
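The modulo sum is simple enough, but a Python sketch makes the rule explicit: the modulo is the number of bytes in a playfield line that are NOT fetched for the display. The function names are invented for the purpose.

```python
def bitplane_modulo(playfield_width, display_width):
    """Bytes to skip at the end of each displayed line (widths in pixels)."""
    if playfield_width < display_width:
        raise ValueError("playfield narrower than the display")
    return (playfield_width - display_width) // 8


def line_start(base, line, playfield_width, display_width):
    """Address held in the bitplane pointer at the start of a given line."""
    fetched = display_width // 8
    skipped = bitplane_modulo(playfield_width, display_width)
    return base + line * (fetched + skipped)


print(bitplane_modulo(640, 320))
print(hex(line_start(0x20000, 1, 640, 320)))
```

For the 640-pixel playfield shown through a 320-pixel display, the modulo comes out as 40 and the second line starts 80 bytes after the first, as expected.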
Of course, both can be combined to make a huge display area, the
sole limitations being available CHIP memory and your imagination. This can
then be scrolled around at will.
Note that for smooth scrolling, the scroll values MUST be changed
outside the time used for displaying the actual bitplanes. This again is
possible using the Copper or the VBL interrupt as above.
Basically, scrolling smoothly is achieved by keeping a pixel scroll
value (the delay written to BPLCON1, offset $102) saved somewhere as well as
the bitplane pointers. Since the delay moves the picture to the right, to
scroll left, take the pixel scroll value, subtract 1, AND with $0F and save
back. If the result equals $0F, add 2 to all bitplane pointers. Then write
these values into the various BPLCON1/BPLxPTH/L registers. For smooth right
scrolling, take the pixel scroll value, add 1, AND with $0F, save back. If
the result is zero then subtract 2 from all bitplane pointers. Write all of
the resulting data into the requisite registers. So now you know.
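The bookkeeping for smooth scrolling can be sketched in Python as follows. The sketch assumes that the scroll register is BPLCON1 at offset $102 and that its value acts as a delay which moves the picture to the right. Under that assumption, moving the picture left means counting the delay down and stepping the pointers on by one word when it wraps. All of the function names are invented for illustration, and left_edge_pixel is only a model for checking the result, not a hardware register.

```python
def scroll_left(delay, offset):
    """Move the picture one pixel left; return the new (delay, byte offset)."""
    delay = (delay - 1) & 0x0F
    if delay == 0x0F:        # wrapped round: step the pointers on one word
        offset += 2
    return delay, offset


def scroll_right(delay, offset):
    """Move the picture one pixel right; return the new (delay, byte offset)."""
    delay = (delay + 1) & 0x0F
    if delay == 0:           # wrapped round: step the pointers back one word
        offset -= 2
    return delay, offset


def bplcon1(delay):
    """BPLCON1 value using the same delay for odd and even planes."""
    return (delay << 4) | delay


def left_edge_pixel(delay, offset):
    """Playfield pixel that appears at the left edge of the window."""
    return offset * 8 - delay


state = (0, 4)               # no delay, pointers 4 bytes into the line
for _ in range(3):
    state = scroll_left(*state)
    print(f"BPLCON1 = ${bplcon1(state[0]):04X}, pixel {left_edge_pixel(*state)}")
```

Each call moves the visible origin by exactly one pixel, including across the word boundary, which is the property the test for this sketch checks.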
Double-Buffering:at this point, any reader having digested both of
the sections on Copperlists and Bitplane Control will have all the informa-
tion to hand to perform double-buffering in hardware. Set up a Copperlist for
the desired screen, complete with bitplane pointer initialisation. I find it
useful to refer to the screen within which rendering is performed as the
logical screen, and the screen currently being displayed as the physical
screen. The double-buffering technique keeps these screens separate. Set up
the Copperlist initially to point to one of the screens, which will become
the physical screen. Perform all rendering in the other screen, which will
become the logical screen, and use the Copper interrupt to determine when it
is safe to change the bitplane pointer initialisation section to restart the
Copperlist after the VBL with the identities of the two screens changed. The
previous logical screen, within which one has rendered all graphic data, will
become the new physical screen, and the current physical screen will then be-
come the new logical screen. Again, perform all rendering in the logical scr-
een. Provided that all rendering can be performed within the time taken to
display one frame (1/50th of a second), the resulting motions of graphic en-
tities within your program will be completely smooth and flicker-free. This
technique requires two sets of screen memory and is thus memory-hungry, but
it is the basic technique for most games requiring smooth object motions. I
shall add at this point that it may be possible to achieve smooth motion by
this means even if it takes up to three frames to perform all rendering, as
the movement of objects within 'Strike Force Harrier' on the ST is reason-
ably smooth (I once worked for the author of that game) even though frame
swapping only occurs at 16 frames per second. With the Amiga's far superior
hardware it should be possible to perform similar rendering at 24 frames per
second or even faster (Strike Force Harrier has up to 32 bob-type objects on
screen at once, hence the time taken for rendering!), and 50 frames per sec-
ond animation is perfectly possible with fewer objects to move, unless they
are truly huge (but see 'Menace'-some of that program's bobs are of a vast
size). With dual playfield mode and oversize playfields, it's even possible
to perform fast rendering using the blitter (see later) and perform parallax
scrolling or even two-direction scrolling!
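The swapping of logical and physical screens amounts to very little code. Here is a minimal Python sketch of the bookkeeping only. The class name is invented, and the actual write of the returned address into the Copperlist (at a safe moment, as discussed above) is left out.

```python
class DoubleBuffer:
    """Keeps track of which screen is displayed and which is drawn into."""

    def __init__(self, screen_a, screen_b):
        self.physical = screen_a   # currently being displayed
        self.logical = screen_b    # currently being rendered into

    def swap(self):
        """Exchange the screens once rendering of the frame is complete.

        Returns the address of the new physical screen, which is the value
        to patch into the bitplane pointer section of the Copperlist.
        """
        self.physical, self.logical = self.logical, self.physical
        return self.physical


screens = DoubleBuffer(0x20000, 0x30000)
print(hex(screens.swap()))
```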
Hardware:Sprite Management
This section has not been thoroughly tested by me for all of the possibili-
ties, because I haven't had occasion to use hardware sprites yet. However,
a mini-preamble will serve to open up ideas.
Denise, the chip responsible for sprite management, is a high-speed
sprite processor using its own DMA channels (8 in all). The existence of 8
DMA channels for sprite processing does NOT limit the programmer to 8 sprites
as on some lesser systems, and with clever programming it is possible to have
up to 72 sprites moving about at once! But before explaining sprite DMA chan-
nel reuse, the technique allowing this, the fundamentals should be covered.
First, some limitations. A hardware sprite has a maximum width of
16 pixels. It can be any size vertically up to the size of the entire screen
if wanted, though usually programmers work with 16x16 sprites or similar. A
sprite can be displayed anywhere on the screen, and appears in front of the
playfields. The Intuition mouse pointer is a hardware sprite, in actual fact
sprite 0, and the DMA allocation within a raster line never allows the time
allocated to sprite 0 to be stolen by bitplane DMA. If no other sprites are
used, the remaining sprite DMA slots CAN be stolen by bitplane DMA for really
wide displays, but it is a good idea not to do this during first experimenta-
tion with sprite management.
Also, a sprite is normally a 3-colour entity. It is possible to have
a 15-colour sprite by combining two sprites together. The restriction here is
that the sprites MUST be combined as follows:sprite 0 with sprite 1, sprite 2
with sprite 3, sprite 4 with sprite 5, and sprite 6 with sprite 7. No other
order is allowed.
Sprite colours are allocated differently for 3-colour sprites and
15-colour sprites. For 3-colour sprites, the allocations are:
Sprite No       Colour Registers
---------       ----------------
0,1             17,18,19 (16 not used)
2,3             21,22,23 (20 not used)
4,5             25,26,27 (24 not used)
6,7             29,30,31 (28 not used)
The unused colours are treated as 'transparent', i.e., the playfield data
shows through where 'colour 0' pixels appear in the sprite, the 'colour 0'
for each sprite being thought of as corresponding to the unused colour reg-
isters. So, if sprite 0 has pixel colours 0,1,2,3, the actual colours used
are transparent,17,18,19. Needless to say, colours will only clash with the
other graphic objects on 5-bitplane screens, extra-halfbrite or HAM screens
in single-playfield mode. If your screen is 4 bitplanes or less, sprite col-
ours are independent of the main screen colours.
To put sprites on screen, almost all that is required is that the
programmer constructs a sprite data list in CHIP RAM, and passes a pointer
to the start of the sprite data list to Denise's sprite control registers.
Once that has been done, the DMA system handles the sprite all by itself.
A sprite data list consists of two control words, followed by the sprite
data itself, and then two more control words, which for a standard sprite
are zero to tell Denise that no more sprite processing is to be performed
using this DMA channel.
The initial two control words tell Denise where the sprite is to
be displayed, and also if two 3-colour sprites are combined to form one 15-
colour sprite.
Now the bad news. Allocation of bits in the sprite control words is
awkward to say the least. It runs as follows:
Control Word 1 : EEEEEEEEHHHHHHHH
Control Word 2 : LLLLLLLLA0000ELH
The E bits represent the first line of the sprite (called VSTART). Control
word 1 contains bits E7-E0 reading left to right, and control word 2 contains
E8. Bits E8-E0 make up the VSTART parameter.
The H bits represent the horizontal position of the sprite. Control
word 1 contains H8-H1 reading from left to right, and control word 2 contains
H0. Bits H8-H0 make up the horizontal position parameter, called HSTART.
The L bits represent the last line of the sprite plus one. Control
word 2 contains L7-L0 in the high byte, reading from left to right, and L8 at
bit 1 of the low byte. I told you it was bloody awkward! This value is known
as VSTOP.
The A bit in the second control word is the ATTACH bit. It tells the
sprite DMA system that this sprite is attached to another sprite, and is set
if this is the case (BUT ONLY IN THE SPRITE DATA LISTS FOR SPRITES 1,3,5,7!).
The comment in the Amiga System Programmer's Guide that these bits are divid-
ed somewhat impractically between these two control words is a masterpiece of
understatement!
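Since the bit shuffling is so awkward, a Python sketch of the packing may help. It follows the layout given above (E = VSTART, H = HSTART, L = VSTOP, A = ATTACH). The function name is invented for illustration.

```python
def sprite_control_words(hstart, vstart, vstop, attach=False):
    """Return (control word 1, control word 2) for a sprite data list."""
    word1 = ((vstart & 0xFF) << 8) | ((hstart >> 1) & 0xFF)  # E7-E0, H8-H1
    word2 = (vstop & 0xFF) << 8                              # L7-L0
    word2 |= 0x80 if attach else 0                           # A
    word2 |= ((vstart >> 8) & 1) << 2                        # E8
    word2 |= ((vstop >> 8) & 1) << 1                         # L8
    word2 |= hstart & 1                                      # H0
    return word1, word2


# A 16-line sprite at the top left corner of the standard PAL window.
w1, w2 = sprite_control_words(129, 41, 41 + 16)
print(f"${w1:04X} ${w2:04X}")
```

Remember that VSTOP is the last line plus one, so a 16-line sprite starting at line 41 has VSTOP = 57. The example gives control words $2940 and $3901.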
Sprite resolution is one low-resolution pixel horizontally, and one
raster line vertically, and these values are constant since sprite DMA is in-
dependent of the playfield modes.
Now, the sprite data list is formed as follows for a single sprite:
Control Word 1,       Control Word 2
Data Word 1 of L1,    Data Word 2 of L1
Data Word 1 of L2,    Data Word 2 of L2
Data Word 1 of L3,    Data Word 2 of L3
Data Word 1 of L4,    Data Word 2 of L4
...                   ...
Data Word 1 of LN,    Data Word 2 of LN
Zero Word,            Zero Word.
The sprite data is treated as being like mini bit planes, the data word 1
corresponding to 'bitplane 1' of the sprite, and data word 2 corresponding to
'bitplane 2' of the sprite. If a given bit in both words is 0, that pixel of
the sprite is transparent, else the colour allocation is as given in the 3-
colour sprite allocation table above. This preamble goes a long way toward
explaining why I haven't bothered with them up to now!
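Turning a drawn sprite into the list format above is a mechanical job, sketched here in Python. Each row is given as 16 pixel values from 0 to 3, the leftmost pixel being placed in bit 15 of each data word. The function name is invented, and the control words are simply passed in ready-made.

```python
def sprite_data_list(control1, control2, rows):
    """Build a sprite data list from rows of 16 pixel values (0-3 each)."""
    words = [control1, control2]
    for row in rows:
        if len(row) != 16:
            raise ValueError("a hardware sprite is 16 pixels wide")
        plane1 = plane2 = 0
        for pixel in row:                    # leftmost pixel lands in bit 15
            plane1 = (plane1 << 1) | (pixel & 1)
            plane2 = (plane2 << 1) | ((pixel >> 1) & 1)
        words += [plane1, plane2]            # data word 1, data word 2
    return words + [0, 0]                    # zero words end the list


rows = [[1] + [0] * 15,                      # one pixel of colour 1 at the left
        [2] * 16]                            # a full line of colour 2
for word in sprite_data_list(0x2940, 0x2B01, rows):
    print(f"${word:04X}")
```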
Now, if the ATTACH bit is set in sprite 1, this tells Denise that
sprite 1 is attached to sprite 0 to make a 15-colour sprite. In this case,
the sprite is treated as a '4-bitplane' entity, and the allocations are:
Data word 1, sprite 0 : 'bitplane 1'
Data word 2, sprite 0 : 'bitplane 2'
Data word 1, sprite 1 : 'bitplane 3'
Data word 2, sprite 1 : 'bitplane 4'
The same applies to the other sprites in combination in ascending numerical
order (PHEW!).
At this point, I warn the reader that if the sprite positions of
attached sprites do not match, Denise treats them as two separate sprites
anyway. POSITIONS OF ATTACHED SPRITES MUST BE IDENTICAL FOR THEM TO BE TREA-
TED AS ATTACHED SPRITES!
The colour allocations for a 15-colour sprite are transparent, then
colour registers 17 to 31 in ascending order, according to the value extracted
from a given bit position in each sprite data word. If, for example, bit 4 of
each sprite data word is 1,0,1,1 in the order above, this corresponds to a
colour value of %1101 or 13, and colour register 29 provides the colour value
for this pixel of the sprite. The colour register is 16+pixel value, unless
the pixel is %0000, in which case it's transparent. Again, PHEW!
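Both colour rules can be written down in a few lines of Python. This is only a sketch of the lookup as described in this section, with invented function names. None stands for a transparent pixel.

```python
def sprite_colour_register(sprite, pixel):
    """Colour register for a 3-colour sprite pixel, or None if transparent."""
    if pixel == 0:
        return None
    return 16 + 4 * (sprite // 2) + pixel


def attached_colour_register(even1, even2, odd1, odd2, bit):
    """Colour register for one bit position of an attached sprite pair.

    even1/even2 are the data words of the even sprite ('bitplanes' 1 and 2),
    odd1/odd2 those of the odd sprite ('bitplanes' 3 and 4).
    """
    planes = (even1, even2, odd1, odd2)
    value = sum(((word >> bit) & 1) << n for n, word in enumerate(planes))
    return None if value == 0 else 16 + value


print(sprite_colour_register(0, 1))
print(attached_colour_register(0x0010, 0x0000, 0x0010, 0x0010, 4))
```

The second call reproduces the example in the text: bits 1,0,1,1 give pixel value 13, and hence colour register 29.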
Sprite DMA channel reuse:I mentioned earlier that it was possible to
display many sprites using the phrase 'sprite DMA channel reuse'. This means
that the two end control words of the sprite are not zero. To reuse a sprite
DMA channel, append the entire sprite data list of a second sprite onto the
first, replacing the zero control words of the first sprite with the starting
control words of the second. Again, if no more sprites are to be displayed,
the final control words of the entire list are zero, else the procedure of
appending a sprite data list continues for as many sprites as required, bear-
ing in mind an important limitation:there must be at least one raster line
between sprites thus appended into a reuse list, to give the DMA time to
read in the new control words.
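Building a reuse list is just a matter of joining sprite lists end to end, as this Python sketch shows. It also checks the one-line gap, on the assumption that VSTOP is the last line plus one, so the next sprite may start no earlier than VSTOP+1. The function name is invented for illustration.

```python
def chain_sprites(sprites):
    """Join (control1, control2, data_words) tuples into one reuse list.

    Sprites must be given from the top of the screen downwards, with at
    least one free raster line between one sprite and the next.
    """
    def vstart(c1, c2):
        return (c1 >> 8) | (((c2 >> 2) & 1) << 8)

    def vstop(c2):
        return (c2 >> 8) | (((c2 >> 1) & 1) << 8)

    words, previous_stop = [], None
    for c1, c2, data in sprites:
        if previous_stop is not None and vstart(c1, c2) < previous_stop + 1:
            raise ValueError("need at least one raster line between sprites")
        words += [c1, c2] + list(data)   # next control words replace the zeros
        previous_stop = vstop(c2)
    return words + [0, 0]                # only the final sprite ends in zeros


first = (0x2940, 0x2B01, [1, 2, 3, 4])   # lines 41-42
second = (0x2C40, 0x2E01, [5, 6, 7, 8])  # lines 44-45, line 43 left free
print(len(chain_sprites([first, second])))
```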
Ok, what if you don't want to use all 8 sprites? Well, turning on
sprite DMA activates all 8 sprite DMA channels, and so the unused ones must
be passed a pair of zero control words to render them inactive. One can use
the existing zero control words at the end of some genuine sprites for this
purpose.
Now for the important part. You have created your sprite data lists
and want to see them activated. Write the address of the start of each of
your sprite lists to the SPRxPTH/L register pairs. The offsets for each of
these registers are:
SPR0PTH : offset $120
SPR0PTL : offset $122
SPR1PTH : offset $124
SPR1PTL : offset $126
SPR2PTH : offset $128
SPR2PTL : offset $12A
SPR3PTH : offset $12C
SPR3PTL : offset $12E
SPR4PTH : offset $130
SPR4PTL : offset $132
SPR5PTH : offset $134
SPR5PTL : offset $136
SPR6PTH : offset $138
SPR6PTL : offset $13A
SPR7PTH : offset $13C
SPR7PTL : offset $13E
This can be done using the Copperlist as might be expected, or the hard way
using the 68000. In any case, initialisation of all of these pointers MUST
be performed in the vertical blank interval if sprite DMA is enabled even if
the registers are pointed at zero control words to disable them. Furthermore
the values stored in these registers change during sprite DMA usage, and so
every time the vertical blank interval occurs, the SPRxPTH/L registers must
be re-initialised, either by the 68000 or using a Copperlist.
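The Copperlist entries for the sprite pointers follow directly from the offset table. The Python sketch below generates them for all eight channels, pointing any unused channel at a pair of zero control words. It assumes the usual Copper MOVE layout of a register offset word followed by a data word. The function name and the addresses are invented for illustration.

```python
SPR0PTH = 0x120  # offset of the first sprite pointer, high word


def sprite_pointer_moves(list_addresses, idle_address):
    """Return Copper MOVE words loading all eight SPRxPTH/L pairs.

    list_addresses maps sprite number to the address of its data list.
    Channels not mentioned are pointed at idle_address, which is assumed
    to hold a pair of zero control words.
    """
    words = []
    for n in range(8):
        addr = list_addresses.get(n, idle_address)
        reg = SPR0PTH + 4 * n
        words += [reg, (addr >> 16) & 0xFFFF]   # SPRxPTH
        words += [reg + 2, addr & 0xFFFF]       # SPRxPTL
    return words


moves = sprite_pointer_moves({0: 0x30000}, idle_address=0x3FFFC)
print(len(moves))
```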
Moving the sprites by changing the position data in the initial con-
trol words must also be performed during the vertical blank interval to en-
sure that Denise receives the correct data, else your sprites could jump all
over the screen in a weird and wonderful fashion! You can use the Copper int-
errupt to signal that it's safe to change them if using a Copperlist to init-
ialise the SPRxPTH/L values, by ensuring that the change only occurs AFTER
the Copper has initialised the registers. Upon initialisation, the control
words are read immediately & held for processing until the correct beam pos-
ition has been reached for displaying them, and once read, the values in the
sprite data lists can be changed safely.
Sprite and Playfield Priority:Having introduced the reader to the
hellish delights of the Sprite management system's basic control registers, I
now wish to make life even more complex by introducing sprite/playfield pri-
ority allocation.
First of all, the lower the sprite number, the higher the priority.
This means that sprite 0 has higher priority than sprite 1, etc., and that the
sprites are displayed as though they were on separate planes, the plane for
sprite 0 being in front of the other sprite planes. In actuality, the sprite
priorities are grouped into the same pairs as for sprite attachment, particu-
larly when the playfields are brought into the whole picture.
So, considering the sprites as paired for priority purposes, in the
same manner as for sprite attachment, a playfield can be arranged in order
of priority according to the table below. In this table, P represents the
playfield, and the digit pairs 01, 23, etc., represent the sprite pairs. All
possible combinations are given below, the element to the left of the pri-
ority arrangement entry list being that with the highest priority.
Playfield Pos       Priority Arrangement
-------------       --------------------
      0             P  01 23 45 67
      1             01 P  23 45 67
      2             01 23 P  45 67
      3             01 23 45 P  67
      4             01 23 45 67 P
Now, if only one playfield is selected, then this table holds for that play-
field. If dual playfield mode is selected, then this table holds for each of
the playfields INDEPENDENTLY (with a few limitations). The selection of the
priorities is controlled by BPLCON2 (offset $104), whose bits are allocated
as follows:
Bit       Function
---       --------
15-7      Unused
6         Playfield Relative Priority
5-3       Priority of Playfield 2 rel. to sprites
2-0       Priority of Playfield 1 rel. to sprites
If bit 6 of this word is set in dual-playfield mode, playfield 2 is deemed
to have higher priority than playfield 1, else playfield 1 has priority over
playfield 2 (the usual state of affairs). Bits 5-3 determine the priority of
playfield 2 relative to the sprites, and the 3-bit value to insert here is
the value in the sprite/playfield priority table above labelled 'Playfield
Pos', corresponding to the given sprite/playfield priority in the table. In
the same way, the 3-bit value for bits 2-0 determining the priority of play-
field 1 relative to the sprites is chosen from the above table.
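Putting a BPLCON2 value together from the two tables can be sketched in Python like so. The function names are invented. The second function merely reproduces a row of the priority table for a given 'Playfield Pos' value.

```python
def bplcon2(pf1_pos, pf2_pos=0, pf2_in_front=False):
    """Build a BPLCON2 value from the two 'Playfield Pos' entries."""
    if not (0 <= pf1_pos <= 4 and 0 <= pf2_pos <= 4):
        raise ValueError("playfield position must be 0-4")
    value = 0x40 if pf2_in_front else 0     # bit 6: playfield 2 in front
    return value | (pf2_pos << 3) | pf1_pos  # bits 5-3 and bits 2-0


def priority_order(pos):
    """Front-to-back order of one playfield and the sprite pairs."""
    order = ["01", "23", "45", "67"]
    order.insert(pos, "P")
    return order


print(f"${bplcon2(3):04X}", priority_order(3))
```

A position of 3 for playfield 1 with everything else zero gives $0003, with the playfield in front of sprites 6 and 7 only.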
Now, since the two playfields have a relative priority, and each of
the playfields has its own independent priority relative to the sprites, it
is a fair question to ask whether the playfields' priority relative to each
other has precedence over their priority relative to the sprites. The answer
is YES. In the Amiga System Programmer's Guide, an example is given for the
value BPLCON2 = $0003. Here, bit 6 is zero, so playfield 1 should be in front
of playfield 2. Bits 5-3 are zero, so playfield 2 should appear in front of
all sprites from 0-7. Bits 2-0 have the value 3, meaning that playfield 1 is
in front of sprites 6 & 7, and behind all of the others. A quick glance at
this description shows something amiss. Playfield 2 cannot be in front of all
sprites and at the same time behind playfield 1 (which is behind sprites 0 to
5). When one of the sprites 0-5 is BETWEEN playfields 1 and 2, it appears in
front of playfield 1, according to its priority. Since this is in front of
playfield 2, the sprite is visible at this point, although it must actually
be behind playfield 2. If only playfield 2 and the sprite are at a given pos-
ition, playfield 2 covers the sprite because of its priority.
In single playfield mode, bits 6-3 have no function, and should
be set to zero. Bits 2-0 still control sprite priorities, and the position of
the single playfield relative to the sprites.
Sprite Collision:those readers wishing that they had never bothered
with sprite handling after reaching this point due to the complexity of the
sprite management system are about to burst into tears over sprite collision.
Firstly, let us overview the basic principles of object collision.
The fundamental principle of graphic element collision is this:when
two graphic elements overlap at a screen position, and both objects have a
set pixel at the same screen position, this is treated as a collision bet-
ween the two graphic elements. More sophisticated collision algorithms for
certain purposes do exist, but these will be ignored here, as they are not
implemented in the Amiga hardware, as will the coordinate comparison algo-
rithm (which is simpler still in some respects and very quick if speed is of
the essence).
When a collision between graphic elements occurs on the Amiga, it
is signalled by setting a bit in the CLXDAT register (offset $00E), which is
a read-only register from the 68000's point of view (only the sprite manage-
ment and blitter DMA can write to this register). The bit allocations of
CLXDAT are as follows:
Bit       Function
---       --------
15        Unused
14        Sprite 4/5 collides with sprite 6/7
13        Sprite 2/3 collides with sprite 6/7
12        Sprite 2/3 collides with sprite 4/5
11        Sprite 0/1 collides with sprite 6/7
10        Sprite 0/1 collides with sprite 4/5
9         Sprite 0/1 collides with sprite 2/3
8         Playfield 2 collides with sprite 6/7
7         Playfield 2 collides with sprite 4/5
6         Playfield 2 collides with sprite 2/3
5         Playfield 2 collides with sprite 0/1
4         Playfield 1 collides with sprite 6/7
3         Playfield 1 collides with sprite 4/5
2         Playfield 1 collides with sprite 2/3
1         Playfield 1 collides with sprite 0/1
0         Playfield 1 collides with playfield 2
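Decoding a CLXDAT value is a table lookup, sketched here in Python. The list simply restates the bit table above, lowest bit first. The function name is invented for illustration.

```python
CLXDAT_BITS = [
    ("playfield 1", "playfield 2"),   # bit 0
    ("playfield 1", "sprite 0/1"),
    ("playfield 1", "sprite 2/3"),
    ("playfield 1", "sprite 4/5"),
    ("playfield 1", "sprite 6/7"),
    ("playfield 2", "sprite 0/1"),
    ("playfield 2", "sprite 2/3"),
    ("playfield 2", "sprite 4/5"),
    ("playfield 2", "sprite 6/7"),
    ("sprite 0/1", "sprite 2/3"),
    ("sprite 0/1", "sprite 4/5"),
    ("sprite 0/1", "sprite 6/7"),
    ("sprite 2/3", "sprite 4/5"),
    ("sprite 2/3", "sprite 6/7"),
    ("sprite 4/5", "sprite 6/7"),     # bit 14
]


def decode_clxdat(value):
    """Return the list of colliding pairs flagged in a CLXDAT value."""
    return [pair for bit, pair in enumerate(CLXDAT_BITS) if value & (1 << bit)]


for first, second in decode_clxdat(0x0201):
    print(first, "collides with", second)
```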
The rules for collision detection are that any non-transparent sprite pixel
can cause a collision to be registered. However, it is possible to decide by
appropriate programming to choose which playfield bitplanes are used in the
determination of collision detection. Also, it is possible to include or ex-
clude any odd-numbered sprite from collision detection. Which graphic ele-
ments are used for collision detection purposes is decided by programming the
CLXCON register (offset $098), which is a write-only register from the point
of view of the 68000. The bit allocations for CLXCON are as follows:
Bit       Function
---       --------
15        Enable collision detection, sprite 7
14        Enable collision detection, sprite 5
13        Enable collision detection, sprite 3
12        Enable collision detection, sprite 1
11        Use bitplane 6 for collision detection
10        Use bitplane 5 for collision detection
9         Use bitplane 4 for collision detection
8         Use bitplane 3 for collision detection
7         Use bitplane 2 for collision detection
6         Use bitplane 1 for collision detection
5         Bitplane 6 collision match bit
4         Bitplane 5 collision match bit
3         Bitplane 4 collision match bit
2         Bitplane 3 collision match bit
1         Bitplane 2 collision match bit
0         Bitplane 1 collision match bit
The first problem the programmer encounters is that collisions between adja-
cent numbered sprites used for sprite attachment cannot be performed. Thus
a collision between sprites 0 and 1, for example, will not be registered in
the CLXDAT register. Collisions between sprite 0 and any sprite from 2 to 7,
or between sprite 1 and sprites 2 to 7, will be registered if the appropriate
control bits are set. If the bit to enable sprite 1 collision detection is
cleared, only sprite 0 collisions between other sprites and/or the playfields
will be reported. If the bit is set, then BOTH sprite 0 AND sprite 1 collis-
ions will be reported, and furthermore reported in the same CLXDAT bit! Thus
choice of sprites needs to be handled carefully if distinctions between the
sprites are important for collision purposes, because in the example I have
just cited, sprites 0 & 1 use the same bit of CLXDAT for collision detection
and thus telling them apart is impossible by merely scanning CLXDAT. If two
sprites have been combined into a single 15-colour sprite using the ATTACH
bit, the corresponding bits for odd sprite collision detection must be set
in CLXCON in order for collision detection to be performed correctly.
For the playfields, the level of control is much more complete. If a
given bit is set in the CLXCON register above from bits 11 to 6, the relevant
bitplanes will be used in collision detection. Bits 5 to 0 are called Match
Bit Plane Value bits, and are used to determine what values to use for the
comparison before reporting collision detection.
Let us assume that we are using 6 bitplanes, and that the BPLCON0
value has enabled all 6 bitplane DMA channels (see the Bitplane Control sec-
tion above). If one of the bits 11 to 6 (called the Enable Bitplane bits)
is set, that bitplane is used. The corresponding match bit (bits 5 to 0) in
the CLXCON register is then used to compare with the given pixel. Let us ass-
ume that the Enable bit for bitplane 3 (bit 8 of CLXCON) is set. If the value
of the pixel data on bitplane 3 at the collision detection point MATCHES the
value of the match bit for bitplane 3 in CLXCON (bit 2) then a collision is
reported. So if the pixel bit is 0, and the match bit is 0 also, the collis-
ion is reported, as in the case when the pixel bit is 1 and the match bit in
CLXCON is 1. If we don't care about a particular bitplane (say for example we
wish to ignore bitplane 1 altogether for collision detection), clear the En-
able bitplane bit for bitplane 1 (bit 6). Now, the value of bit 0 of CLXCON
doesn't matter-the collision will be reported regardless of the state of
the pixels on bitplane 1.
The table given in the Amiga System Programmer's Guide is reproduced
here for those who want it. It correlates directly with the above
description of bitplane selection & collision detection control. The Enable
Bitplane bits are referred to as ENBPx, the Match Bitplane Value bits as
MVBPx.
The 'xx' bits in the table below are "don't care" bits-they can be either 0
or 1.
ENBPx       MVBPx       Collision possible with bit pattern
-----       -----       -----------------------------------
111111      111111      111111 only
111111      111000      111000 only
111100      1111xx      111100, 111101, 111110, 111111 only
011111      x00000      000000, 100000 only
000000      xxxxxx      Any bit pattern!!!
Take note, that if fewer than 6 bitplanes are used, the ENBPx bits for the
unused bitplanes MUST be set to zero!
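The matching rule can be modelled in a couple of lines of Python. This sketch treats all six bitplanes as a single pattern, exactly as the table does: every enabled plane must equal its match bit, and disabled planes are ignored. The function names are invented, and the pixel value has bit 0 standing for bitplane 1.

```python
def clxcon_value(enable, match, odd_sprites=0):
    """Assemble CLXCON from ENBP6-1, MVBP6-1 and the odd sprite enables."""
    return ((odd_sprites & 0x0F) << 12) | ((enable & 0x3F) << 6) | (match & 0x3F)


def playfield_collides(clxcon, pixel):
    """True if a 6-bit playfield pixel value can register a collision."""
    enable = (clxcon >> 6) & 0x3F     # ENBP6-ENBP1
    match = clxcon & 0x3F             # MVBP6-MVBP1
    # Any enabled plane that differs from its match bit rules the pixel out.
    return ((pixel ^ match) & enable) == 0


reg = clxcon_value(0b111100, 0b111100)
print([f"{p:06b}" for p in range(64) if playfield_collides(reg, p)])
```

Running the rows of the table through playfield_collides gives the same bit patterns as the right-hand column.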
Needless to say, if the colours are chosen suitably, various colli-
sion strategies based upon colour can be constructed, as well as strategies
based directly upon bitplane management. It is possible, for example, to set
sprite collision to register with only red and green pixels of a playfield,
or collision with the transparent points of playfield 1 to register only if
the underlying pixels of playfield 2 are black.
Spurious Sprite Video Data:occasionally software that directly con-
trols the hardware suffers from the appearance of a line down the screen at
some point. Analysis usually (but not always) shows that this line corres-
ponds in position to the position of the CLI sprite pointer prior to swit-
ching off sprite DMA. If this is done before the sprite management system has
finished displaying the CLI sprite pointer, then when sprite DMA is turned
off, the sprite management system cannot read the end control words, and thus
continues displaying the sprite data onscreen. Even when sprite DMA is re-
enabled, the contents of the sprite data pointers may not point to zero con-
trol words, in which case spurious sprite video data may continue to appear
until the sprite data pointers are changed.
There are two ways to prevent this. The first technique is to point
the sprite data pointer registers at zero control words. The second technique
(thanks to Count Zero) is to wait for the electron beam to reach a position
beyond the maximum possible display position of the CLI sprite pointer, and
then turn off the sprite DMA (use a raster line value of 300 for PAL Amigas).
Other Sprite Registers:there exist other sprite data registers, that
are normally accessed by the sprite management system DMA alone. These regis-
ters can be accessed in software by the 68000 also, and these registers are:
SPR0POS (offset $140)
SPR0CTL (offset $142)
SPR0DATA (offset $144)
SPR0DATB (offset $146)
... ...
SPR7POS (offset $178)
SPR7CTL (offset $17A)
SPR7DATA (offset $17C)
SPR7DATB (offset $17E)
The registers for SPRxPOS onwards occupy the entire range of offsets from
$140 to $17E, in ascending sprite number order.
When the programmer assigns sprite management control to the stan-
dard sprite management DMA channels, the sequence of events is:
1) DMA system loads two control words into SPRxPOS (control
word 1) and SPRxCTL (control word 2).
2) DMA system turns off sprite output.
3) DMA controller waits for the electron beam to reach the
value in the VSTART portion of the sprite control words.
4) Once this position is reached, data words are written
into SPRxDATA and SPRxDATB.
5) DMA controller turns on sprite output again, and the
values in SPRxDATA and SPRxDATB are used for the
current raster line. These are positioned according to
the HSTART value.
6) DMA controller continues reading data into SPRxDATA/B
and displaying it until the VSTOP value is reached.
7) The DMA controller reads the two control words at the
end of the sprite data list. If these are non-zero, the
sprite data channel is being re-used, and the sequence
of events begins again at 1).
8) If the DMA system encounters the two zero control words
at the end of the sprite data list, the DMA controller
turns off sprite data output on this DMA channel until
the vertical blank interval occurs, at which point the
sprite DMA begins its display sequence at 1) again.
If the programmer uses the 68000 to access these registers, normally used
solely by the DMA controller (but accessible to the 68000), then there are
a few changes to take note of. First, the sprite data pointer registers need
to be initialised as for sprite management via DMA-this remains the same.
The sprite data list contents change somewhat, however. If the 68000 is used
to load the SPRxPOS/SPRxCTL registers, then only the HSTART value in the
sprite data list control words, plus the value of the ATTACH bit, need to be
valid. VSTART and VSTOP are used only by the DMA controller.
Sprite data output can then begin by writing data to the SPRxDATA/B
registers. Write to SPRxDATB first, as writing to SPRxDATA causes the data
from both registers to be output to the screen. Note that if the DMA
controller is bypassed in this way, fresh data must be written to the
SPRxDATA/B registers for each line of the sprite (normally this is
performed by the DMA controller), unless all that is wanted is a solid column
of identical pixel data.
To turn off the sprite again, simply write some value to SPRxPOS. I
suggest writing the value zero.
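As a sketch of how the two control words are put together, here is a little C that packs them. It assumes the standard OCS bit layout. SPRxPOS holds the low eight VSTART bits in its top byte and HSTART bits 8-1 in its bottom byte. SPRxCTL holds the low eight VSTOP bits in its top byte, ATTACH in bit 7, the ninth VSTART and VSTOP bits in bits 2 and 1, and HSTART bit 0 in bit 0. The function names are illustrative only.

```c
#include <stdint.h>

/* Control word 1 (SPRxPOS). */
uint16_t sprite_pos(unsigned hstart, unsigned vstart)
{
    return (uint16_t)(((vstart & 0xFFu) << 8) | ((hstart >> 1) & 0xFFu));
}

/* Control word 2 (SPRxCTL). */
uint16_t sprite_ctl(unsigned hstart, unsigned vstart, unsigned vstop, int attach)
{
    return (uint16_t)(((vstop & 0xFFu) << 8)
                    | (attach ? 0x80u : 0u)
                    | (((vstart >> 8) & 1u) << 2)
                    | (((vstop  >> 8) & 1u) << 1)
                    | (hstart & 1u));
}
```

When the 68000 drives the sprite itself, vstart and vstop can simply be passed as zero, since only HSTART and ATTACH matter in that case.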
Hardware:The Blitter
The blitter is THE chip that makes the Amiga so special, and it will come as
no surprise to realise that this section will probably be the largest section
in this file. It has the capacity to move data at a peak speed of 16 million
pixels per second, perform logical operations upon its data sources before
generating its output, and is used for three principal functions:
1) Transferring rectangular graphic data blocks to screen
bitplane memory (with logical operations to change the
plotting method)
2) Drawing lines between any two points on screen
3) Filling bounded areas (taking account of some restrictions)
to create filled polygon shapes
This is not its entire repertoire, however. The blitter contains all of the
on-chip logic necessary to perform vector-architecture mathematics, but is
prevented from doing so directly by its hard-wired design. However, it is
possible to make the blitter perform high-speed computational functions on
large blocks of data in one go, provided that one is familiar with Boolean
Algebra and has at least a first-year university level grounding in formal
logic up to the level of alternational normal schemata. For those interested,
I recommend 'Methods of Logic' by Willard Van Orman Quine as THE definitive
text on formal logic (ask for the Library of Congress record number in
preference to the ISBN, as it's an American publication).
Having whetted the appetite, now comes a small amount of bad news.
The blitter is powerful, but its power has associated with it a certain am-
ount of complexity. In particular, although it has a well-defined set of
registers for function control, the bit allocations of the BLTCONx control
registers (see register list below) change dramatically with each function.
There are also several rigid conventions to follow, otherwise the blitter
may just scribble at high speed all over critical program memory, as once
started up, it cannot be stopped halfway.
Blitter Register List:the list of registers associated with the
blitter is:
Register  Offset  Function
--------  ------  --------
BLTDDAT   000     Blitter Data D (read-only!)
BLTCON0   040     Blitter Control Register 0
BLTCON1   042     Blitter Control Register 1
BLTAFWM   044     Blitter A first word mask
BLTALWM   046     Blitter A last word mask
BLTCPTH   048     Source C data pointer high word
BLTCPTL   04A     Source C data pointer low word
BLTBPTH   04C     Source B data pointer high word
BLTBPTL   04E     Source B data pointer low word
BLTAPTH   050     Source A data pointer high word
BLTAPTL   052     Source A data pointer low word
BLTDPTH   054     Destination D data pointer high word
BLTDPTL   056     Destination D data pointer low word
BLTSIZE   058     Controls data size/starts blitter
BLTCMOD   060     Source C modulo register
BLTBMOD   062     Source B modulo register
BLTAMOD   064     Source A modulo register
BLTDMOD   066     Destination D modulo register
BLTCDAT   070     Blitter Source C Data
BLTBDAT   072     Blitter Source B Data
BLTADAT   074     Blitter Source A Data
Some of these registers are not accessed during standard blitter usage. They
do come into play for some of the less well-publicised functions for which
the blitter is used. More of this later.
Rectangular Data Block Moving:this is the first blitter function to
be dealt with in this section, and a preamble will help introduce certain key
concepts.
The blitter, when operating in data copy mode, takes data from up to
three different source areas of memory, combines them using a logical opera-
tion, and then writes the data out to the destination memory area. The main
use of this function is for copying large blocks of graphic data to bitplane
memory for display, or copying portions of bitplane memory to safe off-screen
areas for background saving.
In general, the source memory is linearly organised, and the graphic
data occurs in sequential words in memory. At this point I stress that the
blitter is a WORD-based device, and that all of the blitter's activities are
based upon word-aligned memory, just like the 68000's program access. Return-
ing to the graphic data, this linear organisation means that the data can be
read sequentially without any problems.
However, to illustrate the blitter methodology, let there exist a
low-resolution screen of one bitplane, and some graphic data that is to be
written to the screen. The low-resolution bitplane is 40 bytes across, or 20
words, and the graphic data has a maximum width of 48 pixels (3 words). The
pointer to the word on the screen bitplane where the data is to be put is
written to the BLTDPTH/L registers, and the pointer to the graphic data is
written to the BLTAPTH/L registers. For now let us ignore other sources and
the actual details of the logical operation used. The blitter functions by
using the address pointers to fetch a data word from each source (here we are
using one source only, source A), and after processing it internally, uses
the pointer to the destination to write the data to the screen. It then inc-
rements the pointers by 2 to access the next word. So for each blitter oper-
ation, the pointers MUST be re-initialised (unless they happen to end up
pointing to memory areas to be referenced by another blitter operation, in
which case the blitter can simply be started up again).
For the graphic data, this poses no problem, since it is sequential-
ly organised in memory. But after writing 3 words of data, the blitter must
have a correction added to the destination pointer to point to the next ras-
ter line to write to. This correction is called the modulo, and is stored in
the modulo register for the appropriate source and destination. In this exam-
ple, the blitter needs to have a correction of 17 words added to the destina-
tion pointer to reference the screen memory correctly, or 34 bytes. Thus the
modulo for the destination, BLTDMOD, is set to 34. The modulo for source A,
BLTAMOD, is set to zero. If other sources are used, the appropriate modulo
must be selected. If source B is used as a graphic data mask organised in the
same way as the actual pixel data, its modulo is again zero. If the source C
is used as a reference to the screen for background masking-in, its modulo
must be the same as the destination modulo, i.e., 34 in this example.
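To see the modulo at work without risking any chip memory, the pointer walk can be modelled in C. This is only a software model of a plain D = A copy (no shifting, no masks), and the names blit_modulo and model_copy are invented for the illustration.

```c
#include <stdint.h>

/* Modulo in bytes: what is left of a row after the blitted words. */
unsigned blit_modulo(unsigned row_bytes, unsigned width_words)
{
    return row_bytes - 2u * width_words;
}

/* Software model of a D = A copy using word pointers and byte modulos. */
void model_copy(const uint16_t *a, uint16_t *d,
                unsigned width_words, unsigned height,
                unsigned amod_bytes, unsigned dmod_bytes)
{
    for (unsigned y = 0; y < height; y++) {
        for (unsigned x = 0; x < width_words; x++)
            *d++ = *a++;          /* both pointers step one word at a time */
        a += amod_bytes / 2;      /* modulo is added at the end of each row */
        d += dmod_bytes / 2;
    }
}
```

For the example above, blit_modulo(40, 3) gives 34, and a two-row copy lands its second row exactly 20 words after the first.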
So, the organisation of source and destination memory needs to be
analysed before setting up both the BLTxPTH/L registers, and the BLTxMOD reg-
isters. This information should be sufficient to cover initialisation of the
registers just mentioned.
Now we need to decide how much data to transfer. The blitter accepts
this information as a single word, coded as follows:
HHHHHHHHHHWWWWWW
The H bits represent the height of the graphic data. This can take any value
from 0 to 1023 lines. Zero is taken to mean the maximum size of 1024 lines,
as a value of zero lines is otherwise silly. The W bits represent the width
of the graphic data in memory words, in our example this is 3. So if our data
is 40 raster lines deep, the value of the H bits is 40, and the value of the
W bits is 3. This gives a final value of
(40 * 64) + 3 = $0A03
This value is written to the BLTSIZE register in the table above. The method
I use for computing BLTSIZE is something of the order of
move.w rows(a0),d0 ;no of raster lines
and.w #$3FF,d0 ;ensure value is 0-1023
lsl.w #6,d0 ;shift
move.w cols(a6),d1 ;no of WORDS across!
and.w #$3F,d1 ;ensure value is 0-63
add.w d1,d0 ;add it in
move.w d0,BLTSIZE(a5) ;here a5 contains $DFF000...
However, writing to the BLTSIZE register starts the blitter! So this
must be the LAST operation performed upon the blitter registers for a given
blitter operation.
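The same packing can be written in C for checking values by hand. The decode helpers assume that a zero field means the maximum: 1024 lines, as stated above, and 64 words for the width, which is my understanding of the hardware rather than something covered in the text. All three function names are illustrative.

```c
#include <stdint.h>

/* Pack height (lines) and width (words) into a BLTSIZE value. */
uint16_t bltsize(unsigned height_lines, unsigned width_words)
{
    return (uint16_t)(((height_lines & 0x3FFu) << 6) | (width_words & 0x3Fu));
}

/* Decode the height field; 0 stands for 1024 lines. */
unsigned bltsize_height(uint16_t v)
{
    unsigned h = (unsigned)v >> 6;
    return h ? h : 1024u;
}

/* Decode the width field; 0 is assumed to stand for 64 words. */
unsigned bltsize_width(uint16_t v)
{
    unsigned w = v & 0x3Fu;
    return w ? w : 64u;
}
```

bltsize(40, 3) reproduces the $0A03 worked out above.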
Now we need to consider logical operations. It is of great assist-
ance if the programmer has a thorough grounding in Boolean Algebra at this
point, since the method used by the blitter relies heavily upon this. The
terminology used for each logical operation selected is 'minterm', which is
short and sweet. The formal technique for deriving minterms labours under the
unfortunate name of 'developed alternation normal schema formation', a fair
mouthful for anyone to handle.
The simple way of thinking about this is to remember that the blit-
ter has 8 possible logical operations hard-wired into it. These operations
are:
ABC : A AND B AND C
ABc : A AND B AND (NOT C)
AbC : A AND (NOT B) AND C
Abc : A AND (NOT B) AND (NOT C)
aBC : (NOT A) AND B AND C
aBc : (NOT A) AND B AND (NOT C)
abC : (NOT A) AND (NOT B) AND C
abc : (NOT A) AND (NOT B) AND (NOT C)
The next table shows which of the operations produce a 1 (true) bit output
dependent upon the values of the input bits:
Operation A B C LFx Bit No.
--------- - - - ----------
ABC 1 1 1 7
ABc 1 1 0 6
AbC 1 0 1 5
Abc 1 0 0 4
aBC 0 1 1 3
aBc 0 1 0 2
abC 0 0 1 1
abc 0 0 0 0
When the blitter processes its data from each source, A,B,C, it feeds the
data into circuits whose outputs are each of the 8 logical operations above.
These are then ORed together to produce the final result sent to the dest-
ination memory. Which ones are selected to OR together are under the control
of the programmer. The LFx bit numbers in the table above are used to select
them, and a byte with the appropriate bits set selects the given logical
operation. So, to select ABC+ABc+AbC+Abc, one uses the select byte $F0 (or,
%11110000 in binary).
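The selection scheme is easy to model in software. The C sketch below treats the select byte as an eight-entry truth table. For every bit position, the A, B and C bits form a number from 0 to 7 (A is the most significant), and the matching LFx bit is the output. The name blit_logic is mine.

```c
#include <stdint.h>

/* Apply a select byte to one word from each of the three sources. */
uint16_t blit_logic(uint8_t lf, uint16_t a, uint16_t b, uint16_t c)
{
    uint16_t d = 0;
    for (int bit = 15; bit >= 0; bit--) {
        unsigned idx = (((unsigned)(a >> bit) & 1u) << 2)
                     | (((unsigned)(b >> bit) & 1u) << 1)
                     |  ((unsigned)(c >> bit) & 1u);
        if ((lf >> idx) & 1u)            /* is this LFx bit selected? */
            d |= (uint16_t)(1u << bit);
    }
    return d;
}
```

With the select byte $F0, the result equals the A word whatever B and C contain.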
But how do we select them? In our example, we want the destination
to match the source input. In other words, D = A. But since B and C are al-
ways present in the blitter, how do we introduce them? Well, we don't care
about the state of B, so we want operations containing both AB and Ab. In a
like manner, we don't care about C, so we want operations containing AC and
Ac. This is the same as performing the Boolean Algebra operations (here, + is
taken to mean OR, and AB is taken to mean A AND B):
A(B+b) = AB + Ab
A(C+c) = AC + Ac
But we want terms of the form ABC etc. Well, take the first expression
A(B+b) = AB + Ab
and append the (C+c) used in the second:
A(B+b)(C+c) = AB(C+c) + Ab(C+c)
= ABC +ABc + AbC + Abc
This just happens to be the example encoded as the select byte $F0 above. The
principle is the same throughout. Some more examples are:
Invert Graphic Data : D = a
D = a
= a(B+b)(C+c)
= aB(C+c) + ab(C+c)
= aBC + aBc + abC + abc
Which is encoded as the select byte $0F.
OR in a graphic into the bitplane : D = A + C
D = A + C
= A(B+b)(C+c) + C(A+a)(B+b)
= AB(C+c) + Ab(C+c) + CA(B+b) + Ca(B+b)
= ABC + ABc + AbC + Abc + CAB + CAb + CaB + Cab
= ABC + ABc + AbC + Abc + ABC + AbC + aBC + abC
= ABC + ABc + AbC + Abc + aBC + abC
Which is encoded as the select byte $FA. Note that where identical terms
appear in the expression above, surplus ones are simply deleted from the
expression.
The last example I shall give is the so-called 'cookie-cut' operation.
This name originates from the way cookie biscuits are cut from the
biscuit mixture when making chocolate chip cookies, familiar to any American,
especially one who has encountered 'Girl Scout' cookies. In this operation,
one source acts as a stencil that chooses, pixel by pixel, between the other
two. Wherever the A bit is 1, the B bit is written to the destination.
Wherever the A bit is 0, the pixel is regarded as transparent, and the C bit
(the background) shows through. Because A only selects, the B data can hold
opaque 0 pixels wherever the corresponding A bit is 1. The operation becomes:
Cookie cut : D = AB + aC
D = AB(C+c) + a(B+b)C
= ABC + ABc + aBC + abC
which encodes as the select byte $CA. In this case, A carries the mask (1 for
every opaque pixel, 0 for every transparent one), B carries the graphic data
and C the background.
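For anyone who would rather not expand the algebra by hand, the select byte can be derived mechanically. Try all eight combinations of A, B and C, and set the LFx bit wherever the wanted function gives 1. The C sketch below does this. The names minterm_byte and fn_* are invented for the illustration.

```c
#include <stdint.h>

typedef int (*blit_fn)(int a, int b, int c);

/* Build the select byte: bit number = A*4 + B*2 + C, as in the table. */
uint8_t minterm_byte(blit_fn f)
{
    uint8_t lf = 0;
    for (unsigned idx = 0; idx < 8; idx++) {
        int a = (int)((idx >> 2) & 1u);
        int b = (int)((idx >> 1) & 1u);
        int c = (int)(idx & 1u);
        if (f(a, b, c))
            lf |= (uint8_t)(1u << idx);
    }
    return lf;
}

/* The four functions worked through in the text. */
int fn_copy_a(int a, int b, int c)   { (void)b; (void)c; return a; }
int fn_invert_a(int a, int b, int c) { (void)b; (void)c; return !a; }
int fn_a_or_c(int a, int b, int c)   { (void)b; return a || c; }
int fn_cookie(int a, int b, int c)   { return (a && b) || (!a && c); }
```

These give $F0, $0F, $FA and $CA respectively, matching the hand derivations.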
So, we now know how to select which memory areas to transfer, how to
set modulo values for transfer to bitplanes, how to determine the data size
(and also start the blitter), and select the appropriate logical operation. I
now wish to introduce the blitter control registers. These affect the manner
in which the blitter works. The two blitter control registers, BLTCON0 and
BLTCON1, have control bits allocated according to the following tables:
Bit BLTCON0 Function
--- ----------------
15-12 ASH3-0 : contain the source A shift distance (see later)
11-8 USEA-D : select which sources/destination are used
7-0 LFx : logical function selection bits mentioned above
Bit BLTCON1 Function
--- ----------------
15-12 BSH3-0 : contain the source B shift distance (see later)
11-5 Unused
4 EFE : Exclusive Fill Enable
3 IFE : Inclusive Fill Enable
2 FCI : Fill Carry In
1 DESC : Descending mode
0 LINE : Turn on Line Drawing Mode
In the case of BLTCON0, the LFx bits that select the minterms have already
been covered. This leaves the USEA-D bits and the ASH3-0 bits.
The USEx bits determine which sources/destination are used. If a
given USEx bit is set, the DMA control system for that source (or
destination) is fully enabled, in which case the blitter operation proceeds
normally as described earlier. If a source is not being used at all, clear the
appropriate USEx bit. This disables the DMA fetches for that channel, but
does NOT remove the channel from the logic operation: the word already held in
the BLTxDAT register for that source is used again and again without being
updated. Because of this, either choose minterms that ignore the unused
source, or load its BLTxDAT register with a known value. The latter can be
used to fill memory with a given value, as follows:
move.w #value,BLTADAT(a5) ;value to fill
move.l #-1,BLTAFWM(a5) ;both A masks = $FFFF (they still apply to BLTADAT)
move.w #$01F0,BLTCON0(a5) ;minterms D=A, USED only
move.w #0,BLTCON1(a5) ;mode = copy data
move.w #0,BLTDMOD(a5) ;no gap between rows of the fill
lea memory_to_fill(pc),a0 ;start of memory area
move.l a0,BLTDPTH(a5) ;point Blitter at it
move.w #size,BLTSIZE(a5) ;how much to fill & startup!
If copying graphic data, pick the USEx bits carefully. If you ARE using a
given source, make sure that its USEx bit is set. And whatever you do, don't
forget to set USED to enable the destination output (the number of programmers
who have forgotten to set USED at some time doesn't bear thinking about) or
else the blitter won't be able to generate the desired output!
The ASH3-0 bits are used to 'fine-tune' the blitter data positioning
for graphic output. Normally, the blitter can only output its data to a word
boundary, being a word-oriented device. To enable pixel-boundary data plot-
ting, the blitter has the ability to shift input data before outputting it.
The ASH3-0 bits contain the number of bit positions to shift the data from
source A to the right before outputting it. The bits BSH3-0 in BLTCON1 have
an almost identical function, this time affecting source B. Hence source A
is usually chosen to be the graphic data, and source B any mask data used. If
background data is used for transparency generation, source C is generally
the preferred choice. Both graphic and mask are shifted before output.
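Putting the BLTCON0 fields together is a matter of shifting and ORing. The C sketch below follows the bit tables above: ASH in bits 15-12, USEA/B/C/D in bits 11/10/9/8, and LFx in bits 7-0. For BLTCON1 in copy mode, it sets only BSH and, optionally, DESC. The macro and function names are mine.

```c
#include <stdint.h>

#define USEA 0x0800u   /* bit 11 */
#define USEB 0x0400u   /* bit 10 */
#define USEC 0x0200u   /* bit 9  */
#define USED 0x0100u   /* bit 8  */
#define BC1_DESC 0x0002u

/* BLTCON0 = source A shift, channel enables, minterm select byte. */
uint16_t bltcon0(unsigned ash, unsigned use, uint8_t lf)
{
    return (uint16_t)(((ash & 0xFu) << 12) | (use & 0x0F00u) | lf);
}

/* BLTCON1 for data copying: source B shift plus optional DESC. */
uint16_t bltcon1_copy(unsigned bsh, int desc)
{
    return (uint16_t)(((bsh & 0xFu) << 12) | (desc ? BC1_DESC : 0u));
}
```

bltcon0(0, USED, 0xF0) gives the $01F0 used in the memory fill example. A cookie-cut with all four channels and a shift of 5 pixels would be bltcon0(5, USEA|USEB|USEC|USED, 0xCA), i.e. $5FCA.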
Moving on to BLTCON1, for data copying, set all bits other than the
BSH3-0 bits to zero. These bits are used for line drawing mode and boundary-
filling mode, covered later.
To complete the register list, there are two mask registers for the
data source A. These are BLTAFWM (Blitter source A first word mask) and
BLTALWM (Blitter source A last word mask). If these registers are both zero,
the first and last word of each raster line of copied graphic data will be
zeroed out-they are used as filter masks for the left and right edges of a
graphic data block. If the graphic is one data word wide, the two registers
operate upon the same source A data word, and only those source A bits that
have the corresponding bits set in BOTH mask registers are allowed through
unchanged (otherwise they are treated as zero). For wider graphic data blocks
the mask registers mask the end words of the data. As yet, I have not had
time to establish conclusively whether source A masking is performed BEFORE
or AFTER source A data shifting under BLTCON0 control, and as the result is
critically dependent upon this, I suggest experimentation before assuming one
way or the other.
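The edge masking itself, leaving the shift question aside, can be sketched in C as follows. It models only what is described above: the first word of a row is ANDed with BLTAFWM, the last with BLTALWM, and a one-word-wide row gets both. The function name is illustrative, and nothing here settles the order of masking and shifting.

```c
#include <stdint.h>

/* Mask one source A word; col runs from 0 to width_words-1 along the row. */
uint16_t mask_source_a(uint16_t a, unsigned col, unsigned width_words,
                       uint16_t fwm, uint16_t lwm)
{
    if (col == 0)
        a &= fwm;                     /* left edge of the block  */
    if (col == width_words - 1)
        a &= lwm;                     /* right edge of the block */
    return a;
}
```

For a block one word wide, the effective mask is therefore BLTAFWM AND BLTALWM.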
As a final note concerning the blitter in data copy mode, I said
above that the remaining BLTCON1 bits other than the BSH3-0 bits should be
cleared. Normally this is the case (using the blitter for graphic data manip-
ulation), but if the blitter is used to move memory blocks, particularly if
the data blocks are overlapping, then it is time to consider setting the
DESC bit.
The DESC bit of BLTCON1 controls whether the address incrementing of
the BLTxPTH/L registers is positive (ascending mode, DESC=0) or negative
(descending mode, DESC=1). An example illustrates the point. Let us copy a
block of memory of size N bytes (N will have to be even for blitter copy),
starting at address X, to a new location at address Y. If X lies lower in
memory than Y, but the difference between X and Y is less than N bytes, then
an ascending copy will overwrite data at the end of the source block before
it has been read. To prevent this, it is possible to copy backwards by
setting DESC=1. If this is done, however, the pointer registers should be set
to addresses X+N-2 and Y+N-2 (the last word of each block) instead of
addresses X and Y, because copying will start at the END of the blocks and
work downwards.
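The decision can be sketched in C. The sketch assumes, as I understand the hardware, that in descending mode each pointer must address the LAST word of its block, i.e. X+N-2 and Y+N-2 for a block of N bytes. The structure and function names are invented for the illustration.

```c
#include <stdint.h>

typedef struct {
    int      desc;        /* value for the DESC bit           */
    uint32_t src_start;   /* value for the source pointer     */
    uint32_t dst_start;   /* value for the destination pointer */
} copy_plan;

/* Choose the copy direction for N bytes from address x to address y. */
copy_plan plan_copy(uint32_t x, uint32_t y, uint32_t n_bytes)
{
    copy_plan p;
    /* An ascending copy only goes wrong when X < Y < X+N. */
    p.desc = (x < y && y - x < n_bytes);
    if (p.desc) {
        p.src_start = x + n_bytes - 2;   /* last word of the source      */
        p.dst_start = y + n_bytes - 2;   /* last word of the destination */
    } else {
        p.src_start = x;
        p.dst_start = y;
    }
    return p;
}
```

Copying $40 bytes from $1000 to $1010 therefore selects DESC=1 with pointers $103E and $104E, while the reverse copy stays in ascending mode.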
NOTE:until recently, I assumed that the choice of Source A for the
graphic data, Source B for the mask, and Source C for the background when a
masked blit was performed was OK. This has transpired to be incorrect. Source
channel selection SHOULD be:
Source A = MASK
Source B = DATA
Source C = BACKGROUND
If this order of selection is made, then the standard minterms mentioned in
the Amiga Hardware Reference Manual hold true, e.g., $CA is the 'cookie-cut'
minterm. See later for an example.
Line Drawing:the blitter has inbuilt line-drawing logic, which was
added by its designer, Jay Miner, after he discovered that the other functions
left space on the silicon for the line-drawing hardware (wow-history lesson
too!) and decided that this would be a welcome feature to include.
The problem with the blitter's line-drawing logic is that it uses a
method unfamiliar to anyone having no experience of hardware geometry engines
(a generic term used to describe high-speed graphics-specific processors). A
line drawn with the blitter has to be described in a manner conforming to the
method used. I shall try to make this simple, but if it seems hard going, I
implore you to persevere-this information will apply to far more sophisticated
geometry engines as well and hence has a wider application.
Lines are represented under normal circumstances using the start and
end coordinates. If the end points of the line being drawn are P (x1,y1) and
Q (x2,y2), this is usually sufficient information. The alternative informa-
tion used by the blitter and other geometry engines is 1) the address of the
memory word in the bitplane where the start point P lies; 2) the number of
points that the line will occupy once drawn; 3) the angular orientation of
the line represented in terms of which 45 degree compass segment (or octant)
that the said angle lies in (measured anticlockwise from zero degrees from
the X-axis:in this system of measurement 90 degrees is due North, 180 degrees
is due West, 270 degrees is due South etc). These compass segments, or oct-
ants, are defined using octant numbers, allocated according to the table:
Angle Range Octant No.
----------- ---------
0-45 degrees 0
45-90 degrees 1
90-135 degrees 2
135-180 degrees 3
180-225 degrees 4
225-270 degrees 5
270-315 degrees 6
315-360 degrees 7
Note that this table assumes that the Y-axis is drawn to be positive
when pointing DOWNWARDS, thus setting up the usual screen coordinate system
where (0,0) is the TOP LEFT CORNER of the screen.
The octant information is insufficient in itself, however. Before I
explain the relationship between the octant numbers and the data actually
sent to the blitter, I shall give the equations describing the line that are
used by the blitter. These are :
dX = X2 - X1 These are used to determine
dY = Y2 - Y1 which octant the line lies in
DX = ABS (dX) DX = Delta X
DY = ABS (dY) DY = Delta Y
DS = MIN(DX,DY) DS = Delta S (smaller Delta)
DL = MAX(DX,DY) DL = Delta L (larger Delta)
These expressions form the basis of geometry engine line drawing (including
that of the blitter, even if it cannot be regarded as a true geometry engine
in the same manner as, for example, the Weitek 1164/1165 series used in the
1167 accelerator board for the Compaq DeskPro 386/20 PC, or the TMS34020).
Now for the hard part. Delta X, Delta Y and Delta S/L are used to
determine which octant should be selected for the line. This is done by the
conversion of the octant number into a 3-bit number, which corresponds to
three bits in the BLTCON1 register as used for line drawing control. How the
bits in BLTCON1 are allocated changes when line drawing is performed. The
bit allocations are:
Bit No  Name        Function
------  ----        --------
15-12   TEXTURE3-0  Value for mask shifting (see below)
11-7    Unused      Always set to zero
6       SIGN        Changes line drawing direction
5       Unused      Always set to zero
4       SUD         Sometimes Up or Down
3       SUL         Sometimes Up or Left
2       AUL         Always Up or Left
1       SING        Singular bit (see below)
0       LINE        Always set to 1 for line drawing
The SUD/SUL/AUL bits are set or cleared according to the following values:
Octant No. SUD SUL AUL
--------- --- --- ---
0 1 0 0
1 0 0 0
2 1 1 0
3 0 0 1
4 1 0 1
5 0 1 0
6 1 1 1
7 0 1 1
and the SIGN bit is set if the computed value of (2 * DS) - DL is less than
zero (to change the direction in which the line is rendered for lines with a
negative slope).
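The delta and octant arithmetic can be tried out in C before committing it to assembler. The sketch below follows the equations above. It builds a 3-bit selector from the sign of dX, the sign of dY and the test DX < DY, then looks up the SUD/SUL/AUL code in the order given by the table (4,0,6,1,5,2,7,3). I am assuming that the table's octant numbers correspond to this selector, and that the 3-bit code sits in BLTCON1 bits 4-2. All names are illustrative.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int      ds, dl;       /* smaller and larger delta             */
    unsigned selector;     /* index into the octant code table     */
    uint8_t  con1_bits;    /* low byte for BLTCON1 in line mode    */
} line_setup;

/* SUD/SUL/AUL codes, in the order of the table above. */
static const uint8_t octant_code[8] = { 4, 0, 6, 1, 5, 2, 7, 3 };

line_setup line_prepare(int x1, int y1, int x2, int y2, int sing)
{
    line_setup s;
    int dx = x2 - x1, dy = y2 - y1;
    int adx = abs(dx), ady = abs(dy);            /* DX and DY */

    s.selector = ((dx < 0) ? 4u : 0u) | ((dy < 0) ? 2u : 0u)
               | ((adx < ady) ? 1u : 0u);
    s.ds = (adx < ady) ? adx : ady;
    s.dl = (adx < ady) ? ady : adx;

    s.con1_bits = (uint8_t)((octant_code[s.selector] << 2) | 1u); /* LINE */
    if (sing)
        s.con1_bits |= 0x02;                     /* SING bit */
    if (2 * s.ds - s.dl < 0)
        s.con1_bits |= 0x40;                     /* SIGN bit */
    return s;
}
```

A line from (0,0) to (10,3) gives DS=3, DL=10, selector 0 and a BLTCON1 low byte of $51 (code 4, LINE and SIGN set).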
Under normal circumstances, the USEA-D bits in BLTCON0 should be set
to the values:
USEA = 1
USEB = 0
USEC = 1
USED = 1
and the minterms set to $CA. The value of the ASH3-0 bits is set to the value
x1 MOD 16
to determine which bit in the start word is the start bit of the line, and in
most literature on blitter line drawing these bits change their name to the
START3-0 bits.
The line can be drawn using a mask, to provide dotted lines accord-
ing to a line dot pattern. This mask is written to BLTBDAT (see register list
above), and for solid lines the value to use is $FFFF. A value of $AAAA or
$5555 produces a finely dotted line, a value of $CCCC a more coarsely dotted
line.
The blitter register initialisation values are therefore as follows
(noting the values DX, DY, DS and DL above):
BLTCPTH/L, BLTDPTH/L : Put the start address of the first
point of the line in these registers.
BLTCMOD, BLTDMOD : Number of bytes making up one raster
line of the bitplane within which the
line is to be rendered. For a low-res
bitplane this is 40. See example code
below.
BLTBMOD : Set to 2 * DS.
BLTAMOD : Set to (2 * DS) - (2 * DL).
BLTAPTL : Set to (2 * DS) - DL.
BLTADAT : Set to $8000
BLTBDAT : Set to your chosen pattern mask
BLTAFWM : Always set to $FFFF
BLTSIZE : Last register initialised (it sets the
blitter going). Set width bits equal to
2 always. Set height bits equal to DL.
Hence value to use is (DL * 64) + 2.
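Three of these values depend only on the start point and on DL, and can be checked with a few lines of C. The sketch assumes a bitplane whose raster lines are row_bytes long and uses the BLTSIZE formula exactly as given above. The function names are mine.

```c
#include <stdint.h>

/* Byte offset, from the start of the bitplane, of the word holding (x1,y1). */
uint32_t line_start_offset(unsigned x1, unsigned y1, unsigned row_bytes)
{
    return (uint32_t)y1 * row_bytes + 2u * (x1 >> 4);
}

/* BLTCON0 for a line: START bits = x1 MOD 16, USEA/C/D set, minterm $CA. */
uint16_t line_bltcon0(unsigned x1)
{
    return (uint16_t)(((x1 & 15u) << 12) | 0x0BCAu);
}

/* BLTSIZE for a line: height bits = DL, width bits = 2. */
uint16_t line_bltsize(unsigned dl)
{
    return (uint16_t)(((dl & 0x3FFu) << 6) | 2u);
}
```

For a line starting at (37,10) on a 40-byte-wide bitplane, the offset is 404 bytes and BLTCON0 is $5BCA. With DL = 10, BLTSIZE is $0282.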
One final point. If the blitter is used to draw the closed border of
a polygon, which is then to be filled by the blitter in boundary-fill mode as
documented later, then the line should be drawn with only one pixel set per
raster line. The blitter provides a line-drawing mode specially for this. To
draw lines normally, set the SING bit in BLTCON1 to zero, and to draw lines
in this special mode, set the SING bit to 1.
The following code can be etched out of this file, and used at will.
It is a drawline() routine complete with documentation which illustrates all
of the above concepts in action. Note, in order to use this code, it is best
to either kill off the operating system altogether or use Forbid() to ensure
that your task is the only one running within the system. My test code using
this kills off Exec, but I haven't included this in case anyone wishes to use
this code and keep the operating system alive.
* Blitter Line Drawing Code : Data Structures and Routines
* Assumes that all rendering is done in a low resolution
* screen of 4 bitplanes depth.
* new line structure definition V2.0. Can be defined statically using
* DC.W or generated by your program as required. Note that this code
* allows multiple lines to be drawn one after the other, and even allows
* mixing of SING and normal lines if wanted as well as lines of different
* colours in sequence!
rsreset
line_screen rs.l 1 ;ptr to 1st screen bitplane
;(assumes continuity)
line_ssize rs.w 1 ;size of 1 bitplane in bytes
line_smod rs.w 1 ;screen modulo
line_coords rs.l 1 ;ptr to line coord list
;workspace entries for drawline() routine
line_deltax rs.w 1
line_deltay rs.w 1
line_2S_L rs.w 1
line_oct_bits rs.b 1
line_pad rs.b 1
line_sizeof rs.w 0
* line coord list structure
rsreset
lc_next rs.l 1 ;=ptr to next coord list entry,
;0 if last in list
lc_x1 rs.w 1
lc_y1 rs.w 1 ;coords of start point
lc_x2 rs.w 1
lc_y2 rs.w 1 ;coords of end point
lc_pattern rs.w 1 ;should you want a dotted line...
lc_bits rs.b 1 ;bitplanes & SING bit if wanted
lc_pad rs.b 1 ;padding byte for alignment
lc_sizeof rs.w 0
* lc_bits : 0-3 = colour (bitplanes in which drawn)
* : 4 = SING bit for line draw
;drawline(a4,a5) a4 = ptr to line structure definition V2.0
;a5 = ptr to custom chip registers
;draws line(s) according to the contents of the line definition
;structure(s). Note : line definition structure is a header structure
;containing workspace used by drawline() in this version. Actual
;coordinate and bitplane information etc., contained in separate list
;pointed to by an entry in the line definition structure.
;Core algorithm from Amiga System Programmer's Guide. Several
;addenda of my own for multiple bitplane handling, SING mode
;drawing etc. (SING mode needed for polygon drawing prior to
;polygon fill-see elsewhere for more info).
;d0-d7/a0-a3 corrupted
drawline move.l line_coords(a4),a3
drawline_l0 moveq #0,d1 ;clear line octant selector
move.w lc_x2(a3),d0
sub.w lc_x1(a3),d0 ;compute deltax = x2-x1
roxl.w #1,d1 ;condition octant selector
tst.w d0 ;>0 or <0?
bge.s drawline_b1 ;>=0 so skip
neg.w d0 ;else absolute value
drawline_b1 move.w d0,line_deltax(a4)
move.w lc_y2(a3),d0
sub.w lc_y1(a3),d0 ;compute deltay = y2-y1
roxl.w #1,d1 ;condition octant selector
tst.w d0
bge.s drawline_b2
neg.w d0 ;absolute value again
drawline_b2 move.w d0,line_deltay(a4)
move.w line_deltax(a4),d2
move.w d2,d3
sub.w d0,d3 ;want largest of Dx,Dy
roxl.w #1,d1 ;condition octant selector
tst.w d3
bge.s drawline_b3
exg d0,d2 ;ensure smallest of the
;two in d0
;From here on, DS = Delta S, DL = Delta L, as in the book. DS = smallest
;of Dx, Dy, and DL = largest of Dx, Dy. I reuse the line_deltax(a4)
;entries in the structure for these, simply changing the order in
;which they appear if needed instead of having separate line_deltas()
;and line_deltal() entries. Trivial really.
;Some stuff is pre-calculated, and then saved in workspace entries
;provided in the line definition structure for this purpose.
drawline_b3 movem.w d0/d2,line_deltax(a4) ;deltax entry = DS, deltay entry = DL
lea octants(pc),a0 ;ptr to octant selector table
add.w d1,a0
clr.w d1
move.b (a0),d1 ;get octant code
move.b lc_bits(a3),d0
and.b #$10,d0 ;get SING bit
lsr.b #3,d0
or.b d0,d1 ;put in eventual blitter
;control bits
asl.w line_deltax(a4) ;2*DS
move.w line_deltax(a4),d0
sub.w line_deltay(a4),d0 ;2*DS - DL
bge.s drawline_b4
or.b #$40,d1 ;set SIGN bit if needed
drawline_b4 move.b d1,line_oct_bits(a4) ;save BLTCON1 bits
;for later
move.w d0,line_2S_L(a4) ;save 2*DS-DL
move.l line_screen(a4),a0 ;screen pointer
move.w d0,d1
sub.w line_deltay(a4),d1 ;2*DS - 2*DL
move.w lc_y1(a3),d2
mulu line_smod(a4),d2 ;y1 * bytes per raster line
move.w lc_x1(a3),d3
asr.w #4,d3
add.w d3,d3 ;2*int(x1/16)
ext.l d3
add.l d3,d2 ;bitplane offset
move.w line_deltax(a4),d3 ;2*DS
moveq #0,d4
move.w lc_x1(a3),d4
and.w #$F,d4 ;frac(x1/16)
ror.w #4,d4 ;create STARTx bits
move.w d4,d5
swap d4
move.w d5,d4 ;copy to TEXTUREx bits
or.b line_oct_bits(a4),d4 ;create BLTCON1 bits
swap d4
or.w #$BCA,d4 ;create BLTCON0 bits
swap d4
move.w line_deltay(a4),d5 ;get DL
moveq #3,d7 ;no of bitplanes - 1
;NB : trick here. Upper word of d7=0 after moveq #3,d7. Use this as
;the bitplane bit number, doing an addq.w #1,d7 each time, and using
;swap d7 to alternate between bitplane number counter & bitplane bit
;position counter.
drawline_l1 swap d7 ;get bitplane bit number
move.w d7,d6 ;ready for test
addq.w #1,d7 ;next bitplane number
swap d7 ;back to loop counter
btst d6,lc_bits(a3) ;bitplane flag set?
beq.s drawline_a1 ;no-get next
drawline_b5 btst #6,DMACONR(a5) ;blitter ready?
bne.s drawline_b5
move.l a0,a1 ;bitplane pointer
add.l d2,a1 ;offset to 1st word of line
move.w d0,BLTAPTL(a5) ;2*DS-DL
move.w d1,BLTAMOD(a5) ;2*DS - 2*DL
move.l a1,BLTCPTH(a5)
move.l a1,BLTDPTH(a5) ;bitplane pointers proper
move.w d3,BLTBMOD(a5) ;2*DS
move.l #-1,BLTAFWM(a5) ;set masks
move.w #$8000,BLTADAT(a5) ;1 bit must be set
move.w line_smod(a4),BLTCMOD(a5)
move.w line_smod(a4),BLTDMOD(a5) ;bitplane moduli!
move.l d4,BLTCON0(a5) ;set blitter control regs!
move.w lc_pattern(a3),BLTBDAT(a5) ;line pattern
move.w d5,d6
lsl.w #6,d6
addq.w #2,d6 ;BLTSIZE = 64*DL+2
move.w d6,BLTSIZE(a5) ;draw line
drawline_a1 add.w line_ssize(a4),a0 ;next bitplane pointer
dbra d7,drawline_l1
move.l lc_next(a3),d0 ;check if more lines to do
beq.s drawline_b6 ;none so exit
move.l d0,a3 ;else set pointer
bra drawline_l0 ;and do it
drawline_b6 rts
octants dc.b 4*4+1
dc.b 0*4+1
dc.b 6*4+1
dc.b 1*4+1
dc.b 5*4+1
dc.b 2*4+1
dc.b 7*4+1
dc.b 3*4+1
even
Before documenting the boundary-fill mode of the blitter, some ideas for
future experimentation include: using different minterms as described in the
Amiga System Programmer's Guide, and tinkering directly with the SING bit to
examine its effects. Also, the SIGN bit's effects can be examined, and the
effect of using values other than $8000 in BLTADAT. I have not tried all of
these, so exercise care. Some of the results could be very interesting,
assuming that the Amiga doesn't conk out under the strain...
Boundary-filling:the blitter has a boundary-fill mode which makes
the construction of filled polygons quite simple, once the vagaries of which
registers have which values are dealt with.
The blitter's boundary-fill operation is very simple-minded. When it
fills a boundary, it recognises the boundary by virtue of the existence of a
single pixel marking the boundary. Once it has found that single pixel, the
blitter fills all blank pixels until it encounters another single pixel mark-
ing the end of the boundary. More correctly, it uses the value of the FCI bit
(the Fill Carry In bit) of BLTCON1 to determine what the value of filled pix-
els should be.
The algorithm is as follows: while a pixel equals zero, that pixel is
replaced by the current value of the fill carry, which starts out as the
value of the FCI bit. For normal fills, FCI is set to zero, ensuring that the
fill leaves blank space around the boundary. Once a set pixel is encountered,
the blitter inverts the fill carry, and then uses the ECE/ICE bits (labelled
EFE and IFE in the code comments further on) to determine what happens to the
boundary pixel itself. If the ICE (inclusive carry enable) bit is set, the
boundary pixel always stays set, so both edges of a filled span are kept. If
the ECE (exclusive carry enable) bit is set, the boundary pixel takes the
value of the fill carry AFTER the inversion, so the pixel that opens a span
stays set and the pixel that closes it is cleared. With ECE set, it is
possible to obtain filled polygons with single-pixel corners, whereas corners
of filled polygons using ICE will always have at least two pixels at any
corner formed by the intersection of two boundary lines forming a peak (such
as the apex of a triangle). This makes most sense when I mention that the
blitter fills horizontally from right to left, a word at a time, until it has
exhausted the row, and then moves on to the next row. Note also that if during
the fill of one raster line of pixels, it fails to encounter a boundary pixel,
the blitter will continue the fill on the next raster line until encountering
a boundary pixel, resulting in weird and wonderful (but not always
desirable!) effects.
Furthermore, the algorithm works only in DESCENDING mode, so that
the blitter's DESC bit must be set.
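To make the fill rule concrete, here is a small software model of what the
blitter does to ONE word of a row. It is only a sketch of the rule as I
understand it (an inclusive fill keeps every boundary pixel, an exclusive
fill gives the boundary pixel the carry value after inversion), not a
description of the actual hardware logic. The routine name soft_fill and its
register usage are my own invention.

```asm
;soft_fill(d0,d1,d2)
;d0.w = source word with the boundary pixels set
;d1.l = fill carry entering the word (0 or 1). Use the FCI value
;       for the rightmost word of a row.
;d2.w = 0 for an inclusive fill, non-zero for an exclusive fill
;returns d0.w = filled word, d1.l = carry for the next word to the left
;d3/d4 corrupt
soft_fill       moveq   #0,d3           ;result word
                moveq   #0,d4           ;bit number, 0 = rightmost pixel
soft_fill_l1    btst    d4,d0           ;boundary pixel here?
                beq.s   soft_fill_b1    ;no - pixel just takes the carry
                bchg    #0,d1           ;yes - invert the fill carry
                tst.w   d2
                bne.s   soft_fill_b1    ;exclusive: pixel = NEW carry
                bset    d4,d3           ;inclusive: boundary pixel kept
                bra.s   soft_fill_b2
soft_fill_b1    btst    #0,d1           ;carry set?
                beq.s   soft_fill_b2    ;no - leave pixel blank
                bset    d4,d3           ;yes - fill this pixel
soft_fill_b2    addq.w  #1,d4           ;next pixel to the left
                cmp.w   #16,d4
                bne.s   soft_fill_l1
                move.w  d3,d0
                rts
```

Feeding in d0 = %0000000000100100 with d1 = 0 gives %0000000000111100 for
the inclusive case and %0000000000011100 for the exclusive case, which shows
why the exclusive fill produces spans one pixel narrower.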
So, to fill a boundary, the procedure is as follows:draw the poly-
gon boundary using the SING mode of the blitter's line-drawing function, so
that any lines with a slope of less than 45 degrees have only one pixel per
horizontal raster line. Having drawn the closed boundary in a suitable mem-
ory buffer, point the BLTxPTH/L registers at the END of the memory buffer,
and activate the blitter fill. This is fast: 16 million pixels per second at
peak speed.
To illustrate the procedure best of all, I present a piece of code
that performs this function. This code can be freely ripped out of this DOC
file and mutilated ad lib to suit the programmer's personal prejudices (note
the use of alliteration! I passed my English Language O-Level! HAH! So what
I hear you say...)
This code uses the line drawing code above to draw the boundary in
a memory buffer, then fills the buffer before transferring the contents of
the buffer to the screen. It uses a data structure, which I also include in
this section, to manage the polygon. Note that I refer to something called
the Laurence trick for managing the blitter. This refers to a trick devised
by a colleague of mine, a Belgian programmer called Laurence Vanhelsuwe who
first used the trick of setting the modulo to -2 for blitter data block move-
ments. I refer to it frequently in my blitter routines.
* polygon definition structure. See draw_polygon() routine
* for more info.
rsreset
poly_screen rs.l 1 ;1st bitplane to draw polygon on
poly_ssize rs.w 1 ;bitplane size
poly_smod rs.w 1 ;bitplane width in bytes
poly_buffer rs.l 1 ;where to draw polygon before rendering
;on the real screen...
poly_wide rs.w 1 ;width of buffer in words
poly_tall rs.w 1 ;height in raster lines
poly_border rs.l 1 ;pointer to line definition structure
poly_xpos rs.w 1 ;position to plot finished polygon
poly_ypos rs.w 1 ;on the screen
poly_flag rs.b 1 ;colours
poly_pad rs.b 1
poly_sizeof rs.w 0
* poly_flag : bits 0-3 = colour
* poly_border : points to line definition structure. This in turn has
* a pointer to a line coord list, each entry of which should have the
* SING bit set in lc_bits.
;draw_polygon(a4,a5)
;a4 = ptr to polygon structure definition block
;a5 = ptr to custom chip registers
;creates polygon using blitter line draw in SING mode into a buffer
;followed by blitter fill. Then transfers the whole lot to the screen
;at the desired polygon coordinates. I.e. pre-plots in buffer. Only
;pre-plots in one bitplane, for max efficiency, then moves the entire
;lot over to the actual screen, copying only to those bitplanes
;required. Obviously, if plotting in colour 9 over a background with
;pixel data in colours 2 & 6, these will show up as colour 11 & 15
;pixel data (for a 4-bitplane screen).
;note:for optimum efficiency of memory usage, define your polygon
;with one corner at (0,0) or as close as possible to it.
;d0-d7/a0-a3 corrupt
draw_polygon move.l poly_border(a4),a3
move.l line_coords(a3),d0 ;get coord list
beq draw_poly_done ;doesn't exist-BYE!
move.l d0,a3
moveq #0,d0 ;potential x & y coord
moveq #0,d1 ;maxima
draw_poly_l1 move.w lc_x1(a3),d2
move.w lc_y1(a3),d3
cmp.w d2,d0
bge.s draw_poly_b1
move.w d2,d0 ;new x maximum
draw_poly_b1 cmp.w d3,d1
bge.s draw_poly_b2
move.w d3,d1 ;new y maximum
draw_poly_b2 move.w lc_x2(a3),d2
move.w lc_y2(a3),d3
cmp.w d2,d0
bge.s draw_poly_b3
move.w d2,d0 ;new x maximum
draw_poly_b3 cmp.w d3,d1
bge.s draw_poly_b4
move.w d3,d1 ;new y maximum
draw_poly_b4 move.l lc_next(a3),a3 ;find next set of coords
move.l a3,d2 ;check if they exist
bne.s draw_poly_l1 ;yes, back for more testing
;this for debug only
move.w d0,debug
move.w d1,debug+2
;Use max x&y coord values to determine size of buffer to clear
;using the blitter. Pixels 0..max_x need int(max_x/16)+1 words and
;rows 0..max_y need max_y+1 raster lines (the old fraction test came
;up one word short whenever max_x was an exact multiple of 16).
lsr.w #4,d0 ;int(max_x/16)
addq.w #1,d0 ;+1 = word count
addq.w #1,d1 ;max_y+1 = line count
draw_poly_b5 move.w d0,poly_wide(a4) ;WORD count!
move.w d1,poly_tall(a4) ;height in raster lines
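;prepare to clear buffer in which polygon is to be drawn.
;Clear using blitter - it's most efficient! Wait for any blit
;still in progress before touching the blitter registers.
draw_poly_b0 btst #6,DMACONR(a5) ;blitter ready?
bne.s draw_poly_b0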
move.w d1,d6
lsl.w #6,d6
add.w d0,d6 ;this is BLTSIZE
move.w d6,debug+4
clr.w debug+6
move.w #$01F0,BLTCON0(a5) ;USED only
;minterms $F0
clr.w BLTCON1(a5) ;normal mode, no shift
move.l poly_buffer(a4),BLTDPTH(a5) ;ptr to buffer
;to clear
clr.w BLTDMOD(a5) ;D modulo zero
clr.w BLTADAT(a5) ;A data zero for clear
moveq #-1,d2
move.l d2,BLTAFWM(a5) ;ensure masks allow data passage
move.w d6,BLTSIZE(a5) ;and clear buffer!
move.l poly_border(a4),a3 ;ptr to coord
;struct for border
add.w d0,d0 ;WORD count to BYTE count
move.w d0,line_smod(a3)
mulu d0,d1 ;size of buffer in bytes
move.w d1,line_ssize(a3) ;won't be a long really!
move.l poly_buffer(a4),d2
move.l d2,line_screen(a3)
move.l a4,-(sp)
move.l a3,a4 ;point to line structure
bsr drawline ;& draw the lines
move.l (sp)+,a4 ;recover pointer
;now activate blitter fill. Note : that works only if descending mode
;selected. This code does that. This is an inclusive fill enable type
;fill, with FCI initially zero (non-inverting fill). The exclusive
;fill version is left commented out below.
draw_poly_b6 btst #6,DMACONR(a5) ;wait till blitter done
bne.s draw_poly_b6 ;busy wait (sigh)
move.l poly_buffer(a4),a0 ;where border is
move.w poly_wide(a4),d1 ;width in words
move.w d1,d2
move.w poly_tall(a4),d3
mulu d3,d1 ;area size in words
add.l d1,d1 ;area size in bytes
add.l d1,a0 ;descending mode-
;adjust pointer
subq.l #2,a0 ;point to last word
;of data proper!
move.l a0,BLTDPTH(a5)
move.l a0,BLTAPTH(a5) ;two pointers
clr.w BLTAMOD(a5)
clr.w BLTDMOD(a5) ;both moduli zero!
moveq #-1,d0
move.l d0,BLTAFWM(a5) ;ensure masks OK
move.w #$09f0,BLTCON0(a5) ;USEA/D, minterms $F0
; move.w #$0012,BLTCON1(a5) ;EFE on, FCI=0, DESC=1
move.w #$000A,BLTCON1(a5) ;IFE on, FCI=0, DESC=1
move.w d3,d0 ;height
lsl.w #6,d0 ;*64
add.w d2,d0 ;add on width in words
move.w d0,BLTSIZE(a5) ;start blitter
;NB : when computing pointers for data blocks in above section,
;use subq.l #2,a0 to point to last words proper, instead of beyond
;the data blocks, otherwise the fill gets confused! Weird things
;happen if you don't do this!
move.l poly_screen(a4),a0 ;screen pointer
move.l poly_buffer(a4),a2 ;ptr to polygon buffer
moveq #0,d0
move.w #BP_BYTES,d5 ;no of bytes per screen bitplane
move.w poly_ypos(a4),d0
mulu poly_smod(a4),d0 ;y coordinate * screen modulo
add.l d0,a0
move.w poly_xpos(a4),d0
move.w d0,d4 ;save x coordinate
lsr.w #4,d0 ;2*int(x/16)
add.w d0,d0
add.w d0,a0 ;now this is initial pointer
move.w poly_smod(a4),d2 ;screen modulo
move.w poly_wide(a4),d3
addq.w #1,d3 ;word cols + 1:Laurence trick
add.w d3,d3 ;Part 2
sub.w d3,d2
move.w d2,d0 ;C,D mods in d0
swap d0
move.w #-2,d0 ;A,B mods also
moveq #-1,d1 ;blitter mask values:Laurence
clr.w d1 ;Trick Part 3
move.w d4,d2 ;get x coordinate
and.w #%1111,d2 ;frac(x/16)
ror.w #4,d2 ;put in top 4 bits for BLTCONx
move.w d2,d3
or.w #$0FCA,d2 ;USEA/B/C/D, minterms $CA
swap d2
move.w d3,d2 ;create BLTCONx bits
moveq #0,d3
move.w poly_tall(a4),d3
and.w #$3FF,d3
lsl.w #6,d3
move.w poly_wide(a4),d6
addq.w #1,d6
and.w #$3f,d6
add.w d6,d3 ;this is BLTSIZE!!
moveq #3,d7 ;no of bitplanes - 1
;here transfer polygon to screen bitplanes according to colour
;specifier. a0 = ptr to screen location, d0 = moduli (C,D high
;word, A,B low word), d1 = blitter mask values, d2 = BLTCON0
;and BLTCON1 control words, d5 = size of 1 screen bitplane in
;bytes, d3 = BLTSIZE value pre-calculated (will stay the same
;size throughout the operation) and a2 = ptr to polygon buffer.
;So leave d0-d5/a0/a2 alone while within the loop!. Leave a4 alone
;anyway or the routine will crash! Freely alter d6/a1/a3. This
;info in case you have any flash refinements to make. Note one of my
;favourite tricks-using SWAP & sticking counters/other data in both
;words of 1 reg.
;Note that the choice of sources A and B doesn't matter here because
;they're both the same! If you use different sources, remember to
;make Source A the MASK, Source B the DATA, Source C the BACKGROUND!
draw_poly_l2 swap d7 ;bitplane bit no
move.w d7,d6 ;copy
addq.w #1,d7 ;next bitplane no
swap d7 ;back to counter
btst d6,poly_flag(a4) ;this bitplane?
beq draw_poly_b8
draw_poly_b7 btst #6,DMACONR(a5) ;wait till blitter done
bne.s draw_poly_b7 ;busy wait (sigh)
move.l a0,BLTDPTH(a5) ;ptr to screen area to plot to
move.l a0,BLTCPTH(a5)
move.l a2,BLTAPTH(a5) ;ptr to polygon buffer
move.l a2,BLTBPTH(a5)
move.w d0,BLTAMOD(a5) ;A,B moduli -2:Laurence trick
move.w d0,BLTBMOD(a5) ;Part 1
swap d0 ;get C,D moduli
move.w d0,BLTDMOD(a5) ;C,D moduli
move.w d0,BLTCMOD(a5)
swap d0 ;recover A,B moduli again
move.l d1,BLTAFWM(a5) ;blitter masks (see above)
move.l d2,BLTCON0(a5)
move.w d3,BLTSIZE(a5) ;start blitter
draw_poly_b8 add.w d5,a0 ;next bitplane
dbra d7,draw_poly_l2 ;continue
draw_poly_done rts
Hardware:Sound Management
For those who like the Jean-Michel Jarre sound effects that the Amiga is able
to reproduce (guess who's bought all of his albums!) and wish to reproduce a
similar gee-whizz set of sound effects, this is the section for you.
However, I warn anyone who hasn't studied maths to a decent level of
the horrors to come. Sound synthesis using the additive method of the Paula
chip is best understood by those who know something about Fourier analysis! I
shall try to make this as painless as possible.
All sound waves, plotted graphically, are formed from lots of sine
(and/or cosine) curves added together. The fun part about adding sine and co-
sine functions together is that you get a more complex waveform as a result.
This is known by the lofty title of the principle of superposition of wave-
forms. So any waveform can be broken down into components of the form
A * sin((n * x) + p)
where A is the amplitude of the component (which equates approximately to the
volume), n is a constant multiplier from 1 onwards, and p is the phase angle.
The phase angle describes how far along the x-axis the curve is shifted. As
it happens to be true that cos(x) = sin(x+w) where w=90 degrees or pi/2 rad-
ians, all components of a sound wave can be represented as above.
As an exercise, to see this in action, try plotting a graph of the
function
sin(x) - (1/3)sin(3x) + (1/5)sin(5x) - ...
for as many terms as you can bother to calculate. You'll find the waveform is
interesting.
This leads directly on to some musical definitions, and their rela-
ted mathematical definitions. The amplitude of the curve, namely the distance
between the highest peak and the lowest trough of the curve, defines the vol-
ume of the sound. The pitch is related to the frequency, which in turn is in
direct relation to the constant n in the first expression above. Where one of
the components of the curve has a large constant multiplier in front of it,
i.e., the value of A is large, that component is the dominant one and defines
the pitch of the note. This is the principal note. The lesser components are
known as the harmonics, especially when they are in some numeric relation to
the principal note. In the second expression above, the second and third
components define harmonics for the note, the principal is the term sin(x),
and the amplitude of the whole is the distance between the peaks & troughs.
Be careful with infinite series, though. At x = 90 degrees the terms of that
particular series all line up as 1+1/3+1/5+..., which grows without limit.
The square wave series given later in this section is better behaved: its
peak value is 1-1/3+1/5-..., which may be an infinite series, but for good
mathematical reasons it ends up as a finite value (pi/4)! Trust me, I did
this at university!
Anyone wishing to construct a waveform using this idea, be warned.
The ultimate result of embarking on this course is to immerse oneself in the
vagaries of Fourier analysis, so called because Joseph Fourier, the French
mathematician, first published a treatise on the subject (he developed it to
analyse the flow of heat, but it applies just as well to music). If you
carry this through, you'll find yourself immersed in masses of trigonometrical
integration and orthogonality computations, and if you don't know what that
means, you're best avoiding this method!
Fortunately, there are other ways of creating your waveform. One can
simply draw a nice-looking graph of what looks like a waveform, and instead
of working out the x,y values the hard way as above, simply read them off the
graph. It is these values that Paula uses!
Now, it is time to discuss other effects. The frequency of the wave
need not be constant. Varying the frequency rapidly about the principal note
by a small amount creates vibrato. Slowing the rate of variation gives rise
to tremolo. The shape of the curve above defines a quality called timbre, and
this is the reason that so many musical instruments sound different even if
they all play the same note. Their sounds, when taken via a microphone, sent
through an amplifier and displayed on an oscilloscope screen, give rise to a
whole variety of different curve shapes. It is these shapes that Fourier's
mathematics describes. And yes, Paula can reproduce
all of these! By way of example, a square wave has the equation
y = sin(x) + (1/3)sin(3x) + (1/5)sin(5x) + ...
which is slightly different to the first example. A sawtooth wave has the
equation
y = sin(x) + (1/2)sin(2x) + (1/3)sin(3x) + ...
and for your education, anyone with a Yamaha DX7 synthesiser will know that
this instrument also builds its sounds out of sine waves, although it combines
them by frequency modulation rather than by simple addition (which is why it
is so hard to program one!).
Noise, as opposed to music, is defined as randomly superposed fre-
quencies. White noise, as it is called, is a mathematical idealisation that
is impossible to achieve in practice, and consists of all possible frequen-
cies superposed one upon another. Pink noise is a more limited selection of
such frequencies played simultaneously, and good pink noise generators are
capable of generating all audible frequencies simultaneously, thus provid-
ing a good approximation to white noise. Noise curves, when plotted, look
like the paths described by spiders walking after immersion in vodka. Anyone
who has seen sound traces on oscilloscopes will know what I mean.
Having finished the preamble, it is now time to consider how to im-
plement this on Paula. First, let us look at the register list for Paula:
Name     Offset  Function
----     ------  --------
ADKCON   09E     Audio & Disc Controller
                 (Cross-reference #5)
AUD0LCH  0A0     High word, audio data address, channel 0
AUD0LCL  0A2     Low word, audio data address, channel 0
AUD0LEN  0A4     Data length, channel 0
AUD0PER  0A6     Period duration, channel 0
AUD0VOL  0A8     Volume, channel 0
AUD0DAT  0AA     Audio data, channel 0 (to D/A converter)
AUD1LCH  0B0     High word, audio data address, channel 1
AUD1LCL  0B2     Low word, audio data address, channel 1
AUD1LEN  0B4     Data length, channel 1
AUD1PER  0B6     Period duration, channel 1
AUD1VOL  0B8     Volume, channel 1
AUD1DAT  0BA     Audio data, channel 1 (to D/A converter)
AUD2LCH  0C0     High word, audio data address, channel 2
AUD2LCL  0C2     Low word, audio data address, channel 2
AUD2LEN  0C4     Data length, channel 2
AUD2PER  0C6     Period duration, channel 2
AUD2VOL  0C8     Volume, channel 2
AUD2DAT  0CA     Audio data, channel 2 (to D/A converter)
AUD3LCH  0D0     High word, audio data address, channel 3
AUD3LCL  0D2     Low word, audio data address, channel 3
AUD3LEN  0D4     Data length, channel 3
AUD3PER  0D6     Period duration, channel 3
AUD3VOL  0D8     Volume, channel 3
AUD3DAT  0DA     Audio data, channel 3 (to D/A converter)
Complications arise because ADKCON controls the disc and UART as well as the
sound output. However, ADKCON works on the SETIT principle in an identical
fashion to DMACON and other like control registers, so that the programmer
can choose to set only those bits pertinent to the sound system. Only the low
8 bits are used for sound, and bit 15 acts as the SETIT bit (#5) as for DMA-
CON. The bit assignments for sound are:
Bit  Name    Function
---  ----    --------
15   SETIT   see DMACON
7    USE3PN  Audio channel 3 modulates nothing
6    USE2P3  Channel 2 modulates period of channel 3
5    USE1P2  Channel 1 modulates period of channel 2
4    USE0P1  Channel 0 modulates period of channel 1
3    USE3VN  Channel 3 modulates nothing
2    USE2V3  Channel 2 modulates volume of channel 3
1    USE1V2  Channel 1 modulates volume of channel 2
0    USE0V1  Channel 0 modulates volume of channel 1
These usages will be explained later.
There exist two possible ways of creating a sound. First, one could
draw a graph of one wave of the waveform, digitise it (i.e., read off the y
values at equally spaced x intervals, and rescale these y values so that they
lie within the range -127 to +127) and put the data into CHIP RAM. Then, set
Paula up to play the waveform, and write a piece of interrupt code to respond
to an audio interrupt by replaying the waveform repeatedly. This limits the
note to one waveform, but the pitch can be varied within the interrupt code.
The second method is to digitise an ENTIRE sound into CHIP RAM, and
set Paula up to play the whole lot in one go. This is obviously a task that
is not suited to hand-calculation, and to assist with this, special digitiser
hardware exists that can be interfaced to the Amiga to perform this function,
this hardware being known as a SAMPLER. By using a sampler, an entire single
record can, in theory, be digitised and played on the Amiga (for an example
of what is possible, listen to the Xenon 2 Megablast soundtrack by Bomb The
Bass!).
Problems occur because of the use of 8-bit data resolution within
the sound system. A certain quality can be achieved, but for comparison, a
compact disc player uses 16 bits for hi-fi quality. So anyone toying with the
idea of reproducing studio quality classical music on the Amiga directly will
possibly be disappointed-there may be a noticeable change in quality. Because
all analogue to digital conversion involves sampling errors, due to the roun-
ding effect of A/D conversion, a quantity called the quantization error res-
ults. For an 8-bit system, the maximum possible quantization error is 1/256
of the digitised value. The magnitude of the quantization error is directly
proportional to a noise value called the quantization noise, which may have a
deleterious effect upon sound quality (hence the use of 16 bits by CD players
which reduces the quantization error to 1/65536 of the digitised value).
Now, the first quantity that requires calculation is the sampling
rate. This is the rate at which the sampler 'snapshots' the analogue sound
data and converts it into digital data. If the data is not sent to Paula at
the same rate, then the sound will be shifted in frequency with the possi-
bility of distortion occurring (but of course, there is no reason why this
cannot be done deliberately for special effects!).
If using one waveform, the number of samples per waveform is equal
to the number of y values you have decided to use. To make sine curves con-
form more closely to the ideal, increase the number of samples.
A simple example illustrates this. Let us reproduce a pure tone, as
represented by a pure sine wave. One complete wave consists of the curve from
0 degrees to 360 degrees. Let us split this up into 16 samples. So we require
the values of sin(x) from 0 to 360 degrees in 16 steps, scaled so that the
end results lie within the range -127 to +127. Because sin(x) varies from -1
to +1, we therefore want the values of 127*sin(x) over this range. The values
thus required are:
0
48.6007
89.8025
117.3327
127
117.3327
89.8025
48.6007
0
-48.6007
-89.8025
-117.3327
-127
-117.3327
-89.8025
-48.6007
Rounding these to the nearest whole number (which introduces the quantization
error unfortunately, and cannot be avoided because Paula can't handle float-
ing point numbers!) and putting this list of bytes into CHIP RAM, yields the
desired sample data. Since Paula uses standard twos-complement bytes, and it
is usual for assemblers to generate these directly from negative numbers, we
can insert these values directly into a DC.B list, e.g.;
sine_wave dc.b 0,49,90,117
dc.b 127,117,90,49
dc.b 0,-49,-90,-117
dc.b -127,-117,-90,-49
Now, we point Paula's data registers at the start of the list. This
is usually done using something like
LEA $DFF000,A5
LEA sine_wave(PC),A0
MOVE.L A0,AUD0LCH(A5)
Now we tell Paula how long the sample list is in WORDS. There are 16 actual
data bytes. The value to write to AUD0LEN is thus 16/2 = 8. For lists with
an odd number of bytes, e.g., 17 instead of 16, use 17/2 rounded DOWN to the
nearest integer, i.e., 8 again. Use the instruction:
MOVE.W #8,AUD0LEN(A5)
to set the list length. For longer lists, use two labels, one at the start of
the list, one at the end of the list, and use something like
MOVE.W #(end-start)/2,AUD0LEN(A5)
(as in the Amiga System Programmer's Guide).
Now we select the output volume. Since this is going to be a cons-
tant value, and we can choose any value from 0 to 64, let us choose half of
full volume:
MOVE.W #32,AUD0VOL(A5)
Now we need to choose our sampling rate. In this case, the sampling rate will
affect the frequency of the sound. So we can use the relation
F = S/N
where F is the frequency, S is the sampling rate, and N is the number of sam-
ples per cycle. In this case, N=16. Now we cannot specify the frequency as a
specific number of Hertz, but need to relate it to the number of bus cycles.
A bus cycle is 279.365 nanoseconds. We need to compute the sample period as
given by the relation
P = 1/(S * 2.79365 * 10E-7)
where I use 10E-7 to represent 10 to the power of -7. To play our sine wave
at International A, or 440 Hz, we need to transform our frequency relation,
and yield
S = F * N
Using F=440, N=16, we have S = 7040 Hz. Now this will give P as the value
1/(7040 * 2.79365 * 10E-7)
or 508.4583. Rounding, this yields 508. We therefore set up Paula using
MOVE.W #508,AUD0PER(A5)
to generate the appropriate International A note. The resulting note will in
actuality deviate by 0.4 Hz from the true value, but only a listener posses-
sing perfect pitch (particularly a classical musician) will notice the diff-
erence.
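The arithmetic above can also be left to the assembler. The following is a
minimal sketch, assuming an assembler that evaluates 32-bit integer
expressions in EQU statements (Devpac does) and the 279.365 nanosecond bus
cycle quoted above, which works out at 3579545 cycles per second. The names
CLOCK, NOTE_HZ, NSAMPLES and PERIOD are my own and purely illustrative.

```asm
AUD0PER         equ     $0A6            ;offset from the register table
CLOCK           equ     3579545         ;bus cycles per second (1/279.365ns)
NOTE_HZ         equ     440             ;F, the note wanted
NSAMPLES        equ     16              ;N, samples in one cycle of the wave
PERIOD          equ     CLOCK/(NOTE_HZ*NSAMPLES) ;integer division gives 508

                lea     $DFF000,a5
                move.w  #PERIOD,AUD0PER(a5)
```

Integer division always rounds down, so if proper rounding is wanted, add
half of the divisor to CLOCK before dividing. Note also that a PAL machine
runs at a slightly different rate (about 3546895 cycles per second), so the
same PERIOD value plays a fraction flat there.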
There exists some limit upon the AUDxPER values. Each audio channel
has one DMA slot per raster line, and hence one word, or two samples, can be
read in one raster line. Thus the smallest possible value for the sampling
period is 114 (NOT 124 as given in the Amiga System Programmer's Guide!). The
value is obtained as follows: one raster line equals 63.5 microseconds. So one
second contains 1/(63.5 * 10E-6) raster lines, which rounded up equals 15748
raster lines. Two samples can be read per raster line, or 31496 samples per
second when read at maximum speed. This yields P = 113.65, or rounded up, 114.
Selecting a value lower than 114 will result in some data words being output
twice, as the DMA system cannot fetch the next data word on time. This gives a
sampling rate of 1/(114 * 2.79365 * 10E-7) or 31399 Hz. Be aware that
Commodore's own hardware manual also quotes 124 as the minimum, so treat 114
as the theoretical limit and 124 as the safe one. Obviously, one can sample at
a slower rate, and even sample using P=65535 for weird effects!
Having set up the registers, we need to activate the audio DMA. The
instruction
MOVE.W #$8201,DMACON(A5)
performs this (it sets AUD0EN and DMAEN just to be sure).
At this point, Paula will play the sound. The example above will be
heard as a single note playing continuously, and eventually you will become
so sick of hearing it you'll turn the volume off on your monitor! This exam-
ple sounds like the horrible 'beep' accompanying the BBC test card...
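Gathering the steps of this example into one listing gives something like
the following. This is a sketch only: it assumes Devpac-style syntax, that
the machine has been taken over from the operating system as elsewhere in
this file, and that the assembler's SECTION directive accepts the DATA_C type
to force the sample into CHIP RAM. The labels play_beep, stop_beep and
sine_end are my own.

```asm
DMACON          equ     $096
AUD0LCH         equ     $0A0
AUD0LEN         equ     $0A4
AUD0PER         equ     $0A6
AUD0VOL         equ     $0A8

;play_beep() : start the 440 Hz sine wave on channel 0
;a5 corrupt
play_beep       lea     $DFF000,a5
                move.w  #$0001,DMACON(a5)       ;AUD0EN off while setting up
                move.l  #sine_wave,AUD0LCH(a5)  ;sets AUD0LCH and AUD0LCL
                move.w  #(sine_end-sine_wave)/2,AUD0LEN(a5) ;length in WORDS
                move.w  #508,AUD0PER(a5)        ;sampling period for 440 Hz
                move.w  #32,AUD0VOL(a5)         ;half volume
                move.w  #$8201,DMACON(a5)       ;SETIT + DMAEN + AUD0EN
                rts

;stop_beep() : silence channel 0 again
;a5 corrupt
stop_beep       lea     $DFF000,a5
                move.w  #$0001,DMACON(a5)       ;SETIT clear = AUD0EN off
                clr.w   AUD0VOL(a5)
                rts

                section beepdata,data_c         ;sample MUST be in CHIP RAM
sine_wave       dc.b    0,49,90,117
                dc.b    127,117,90,49
                dc.b    0,-49,-90,-117
                dc.b    -127,-117,-90,-49
sine_end
```

Because the sample lives in its own section, the pointer is loaded with an
absolute MOVE.L rather than the PC-relative LEA used in the fragment above.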
Now for some details. Paula has internal registers for data fetching
unlike the blitter (why the hell couldn't the blitter have them?), and when
Paula is started up, the AUDxLCH/L register pair contents are copied to these
internal registers before data fetching. Thus the initial values of AUDxLCH/L
are preserved. The same is true for AUDxLEN, AUDxVOL and AUDxPER (otherwise
you wouldn't hear the same 'beep' all the time when playing the example sound
above!). Only when the audio DMA is turned off will the sound stop!
Now since the AUDxLCH/L values, etc, are copied to internal registers
while playing, the processor can supply new values to change the note on
a continual basis. Uninterrupted sound generation can thus continue.
Paula generates an interrupt at the end of each complete sound out-
put cycle. By enabling the audio interrupt bits, and linking in an interrupt
handler to the audio interrupt vector (IPL4) it is possible to change which
note is played under interrupt control. Be warned that for high frequencies
the interrupt occurs VERY OFTEN and if your interrupt code cannot respond to
it fast enough, the supervisor stack will overflow with unsatisfied inter-
rupt requests and crash the machine!
As for all interrupt code, the interrupt handler must clear the
INTREQ bit for the appropriate channel after checking which channel caused
the interrupt to occur, and then perform whatever function the user desires,
be it changing the frequency of a note such as played in the above example,
or performing more complex changes such as switching between waveforms.
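As an illustration of the handler just described, here is a minimal sketch
for channel 0. It assumes a plain 68000 with the vector table at address
zero, that the operating system is no longer using the level 4 autovector,
and that the main program puts the next period value in the word aud0_newper.
All of the labels are my own, and a real handler would also test bits 8 to 10
of INTREQR for channels 1 to 3.

```asm
INTENA          equ     $09A
INTREQ          equ     $09C
INTREQR         equ     $01E
AUD0PER         equ     $0A6

;install_aud0() : hook the handler in and enable the AUD0 interrupt
;a5 corrupt
install_aud0    lea     $DFF000,a5
                move.l  #aud0_int,$70.w         ;level 4 autovector
                move.w  #$0080,INTREQ(a5)       ;clear any stale AUD0 request
                move.w  #$C080,INTENA(a5)       ;SETIT + INTEN + AUD0
                rts

;the handler itself. Keep it SHORT - it runs once per pass through
;the sample data.
aud0_int        movem.l d0/a5,-(sp)
                lea     $DFF000,a5
                move.w  INTREQR(a5),d0
                btst    #7,d0                   ;was it channel 0?
                beq.s   aud0_int_b1             ;no - not ours
                move.w  #$0080,INTREQ(a5)       ;clear the AUD0 request bit
                move.w  aud0_newper,AUD0PER(a5) ;new pitch for next pass
aud0_int_b1     movem.l (sp)+,d0/a5
                rte

aud0_newper     dc.w    508                     ;main program changes this
```

The new period is picked up by Paula the next time it reloads its internal
registers, so the change of pitch takes effect without a break in the sound.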
Modulation:the term modulation is used to describe the process by
which one waveform can be used to change the output from another. Normally
two quantities are subject to modulation, volume and frequency.
When a waveform A is used to alter the volume at which waveform B is
output, waveform A is described as a VOLUME ENVELOPE. When the waveform A is
used to alter the pitch of the notes played, waveform A is then described as
a TONE ENVELOPE.
Volume envelopes are the simplest to describe. With a volume envel-
ope, it is possible to generate full ADSR synthesised sound. ADSR stands for
Attack, Decay, Sustain, Release, and describes four steps in the generation
of a sound. Attack describes how a sound builds up in volume from zero to a
maximum. Sustain describes the section where the volume is maintained at a
constant volume for a specific time period. Decay and Release both describe
how sound volume decreases, the Decay phase(s) following Attack/Sustain pha-
ses, and the Release phase occurring at the end of the note.
When using a volume envelope, the data for the volume envelope is
entered as a list of words, each holding one volume value from 0 to 64 (not
as bytes like normal sample data). A second area of memory is used for the
sample data itself. For example, let channel 0 modulate channel 1. For this
process, AUD0VOL is turned off (channel 0 is not used for actual sound gene-
ration), AUD0LCH/L is pointed to the envelope data, and AUD0PER can be set
either to the same rate as AUD1PER for the actual sound data, or to a diff-
erent value, according to the effect desired. AUD1LCH/L is pointed to the
sample data, AUD1VOL is set to the maximum value (64), and AUD1PER to the
sampling rate pertinent for the sample data. Lastly, set USE0V1 in ADKCON and
set Paula running.
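The channel 0 modulates channel 1 arrangement just described might be set up
as follows. This is a sketch under several assumptions of mine: the envelope
is stored as one volume value per word, the envelope period of 20000 is an
arbitrary slow rate chosen for illustration, and the SECTION directive
accepts DATA_C as in Devpac. The labels are hypothetical. Since both channels
loop, the envelope repeats for as long as the DMA is left on.

```asm
DMACON          equ     $096
ADKCON          equ     $09E
AUD0LCH         equ     $0A0
AUD0LEN         equ     $0A4
AUD0PER         equ     $0A6
AUD0VOL         equ     $0A8
AUD1LCH         equ     $0B0
AUD1LEN         equ     $0B4
AUD1PER         equ     $0B6
AUD1VOL         equ     $0B8

;envelope_on() : channel 0 shapes the volume of channel 1
;a5 corrupt
envelope_on     lea     $DFF000,a5
                move.w  #$0003,DMACON(a5)       ;channels 0 & 1 off first
                move.w  #$00FF,ADKCON(a5)       ;clear all modulation bits
                move.l  #vol_env,AUD0LCH(a5)    ;channel 0 = envelope
                move.w  #(vol_env_end-vol_env)/2,AUD0LEN(a5)
                move.w  #20000,AUD0PER(a5)      ;slow envelope rate
                clr.w   AUD0VOL(a5)             ;channel 0 itself is silent
                move.l  #sine_wave,AUD1LCH(a5)  ;channel 1 = the sound
                move.w  #(sine_end-sine_wave)/2,AUD1LEN(a5)
                move.w  #508,AUD1PER(a5)
                move.w  #64,AUD1VOL(a5)         ;maximum, envelope overrides
                move.w  #$8001,ADKCON(a5)       ;SETIT + USE0V1
                move.w  #$8203,DMACON(a5)       ;SETIT+DMAEN+AUD1EN+AUD0EN
                rts

                section envdata,data_c          ;all of this in CHIP RAM
vol_env         dc.w    16,32,48,64             ;attack
                dc.w    56,48                   ;decay
                dc.w    48,48,48,48             ;sustain
                dc.w    32,16,8,0               ;release
vol_env_end
sine_wave       dc.b    0,49,90,117
                dc.b    127,117,90,49
                dc.b    0,-49,-90,-117
                dc.b    -127,-117,-90,-49
sine_end
```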
A frequency envelope is defined in a different way. Instead of defi-
ning it as a list of sample data bytes, it is defined as a list of sampling
rate words as written to the AUDxPER registers. If the base note of the sound
has a principal frequency of 440 Hz as in the example above, and thus has a
sampling period value of 508, the values in the frequency envelope can vary
around this value to create vibrato/tremolo effects, or alternatively, vary
in one direction to produce ascending/descending pitch. The larger the value
used, the lower the sampling rate used and the lower the frequency resulting
when the sample is played. Depending upon the rate of sampling used for the
frequency envelope, either vibrato or tremolo can be produced.
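A frequency envelope is set up in much the same manner. In this sketch the
list holds period values wobbling either side of the 508 used for the 440 Hz
note, and USE0P1 is set instead of USE0V1. The envelope rate of 30000 and all
of the labels are my own choices for illustration, and DATA_C is assumed to
be accepted by the assembler as in Devpac.

```asm
DMACON          equ     $096
ADKCON          equ     $09E
AUD0LCH         equ     $0A0
AUD0LEN         equ     $0A4
AUD0PER         equ     $0A6
AUD0VOL         equ     $0A8
AUD1LCH         equ     $0B0
AUD1LEN         equ     $0B4
AUD1PER         equ     $0B6
AUD1VOL         equ     $0B8

;vibrato_on() : channel 0 wobbles the pitch of channel 1
;a5 corrupt
vibrato_on      lea     $DFF000,a5
                move.w  #$0003,DMACON(a5)       ;channels 0 & 1 off first
                move.w  #$00FF,ADKCON(a5)       ;clear all modulation bits
                move.l  #per_env,AUD0LCH(a5)    ;channel 0 = period list
                move.w  #(per_env_end-per_env)/2,AUD0LEN(a5)
                move.w  #30000,AUD0PER(a5)      ;rate of the wobble
                clr.w   AUD0VOL(a5)             ;channel 0 itself is silent
                move.l  #sine_wave,AUD1LCH(a5)  ;channel 1 = the sound
                move.w  #(sine_end-sine_wave)/2,AUD1LEN(a5)
                move.w  #508,AUD1PER(a5)        ;starting pitch
                move.w  #64,AUD1VOL(a5)
                move.w  #$8010,ADKCON(a5)       ;SETIT + USE0P1
                move.w  #$8203,DMACON(a5)       ;SETIT+DMAEN+AUD1EN+AUD0EN
                rts

                section vibdata,data_c          ;all of this in CHIP RAM
per_env         dc.w    508,506,504,506         ;pitch drifts up...
                dc.w    508,510,512,510         ;...and back down again
per_env_end
sine_wave       dc.b    0,49,90,117
                dc.b    127,117,90,49
                dc.b    0,-49,-90,-117
                dc.b    -127,-117,-90,-49
sine_end
```

Replacing the list with steadily rising or falling values gives the
descending or ascending pitch mentioned above.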
Note that it is possible to set both the USE0V1 and USE0P1 bits. If
this is done, the data for channel 0 will change BOTH the volume and the
frequency of the output from channel 1. As far as I can tell from Commodore's
documentation, Paula then reads the channel 0 data as alternating words, a
volume value followed by a period value, so the envelope list must be built
in that order or the results will be unpredictable.
Sound generation problems:in the discourse above, it was mentioned
that the maximum sampling rate possible was 31399 Hz (ignore the value used
in the Amiga System Programmer's Guide-it is WRONG). For a digitised sine
wave of 16 sample data points, the maximum frequency possible for this sine
wave is 1962.4 Hz. To generate higher frequencies, the number of samples must
be reduced, so that, for example, a sine curve with 8 samples will have a
maximum possible reproduction frequency of 3924.8 Hz. But as the number of
samples decreases, the quality of the sine wave also decreases, until in the
limiting case, the result is an annoying buzzing noise.
One way of circumventing the problem is to digitise multiple waves
in the waveform instead of just one. This allows higher-frequency waveforms
to be generated without the problems of waveform degeneration. But a second
problem occurs with the higher frequencies. There exists a phenomenon assoc-
iated with high-frequency sound generation called aliasing distortion. This
comes in two forms, one as the sum of the sampling rate and the desired fre-
quency, and the second as the difference. So, for example, if the sound is of
3KHz frequency, and the sampling rate is 12KHz, the aliasing distortion will
occur at 9KHz and 15KHz.
To eliminate this aliasing distortion, Paula contains a device known
as a low-pass filter. This has been placed between the output of the D/A con-
verters and the audio connectors in Paula, and its effect is to allow all of
the frequencies below 4KHz to pass undisturbed. Frequencies between 4KHz and
7KHz are diminished in amplitude, and frequencies above 7KHz are not passed
at all.
In the example cited above, the 3KHz main signal is allowed to pass
to the speaker undisturbed, but both of the aliasing distortion frequencies
lie above the 7KHz cutoff point and are eliminated. Should the sampling rate
be reduced to 9KHz, however, the aliasing distortions now occur at 12KHz and
6KHz, and the 6KHz aliasing distortion, although diminished, is still allowed
past to the speaker.
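As a rough sketch of the arithmetic above (my own illustration, not a
routine from any manual; the label names and register usage are simply my
choice), the two aliasing frequencies and the filter test can be worked out
like this:
* in:  d0.l = sampling rate in Hz, d1.l = sound frequency in Hz
*      (assumes d0 is larger than d1)
* out: d2.l = sum alias, d3.l = difference alias
*      d4.b = $FF if the difference alias is at or below the
*      7KHz cutoff (so some of it reaches the speaker), else 0
FILTCUT  equ     7000
alias    move.l  d0,d2
         add.l   d1,d2               ;sampling rate + frequency
         move.l  d0,d3
         sub.l   d1,d3               ;sampling rate - frequency
         cmp.l   #FILTCUT,d3
         sls     d4                  ;d3 <= 7000 (unsigned)?
         rts
With d0=9000 and d1=3000 this yields 12000 and 6000, with d4 set, matching
the second case described above.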
So, when determining sampling rates, to ensure destruction of the
aliasing distortion and passage of the desired sound signal, the sampling
rate must be chosen so that it is greater than the frequency of the highest
frequency component of the sound PLUS the 7KHz cutoff point. So for a sound
with a maximum frequency component of 4KHz, the sampling rate must be greater
than 4KHz+7KHz, or 11KHz. Needless to say, given the upper limit of 31KHz on
the sampling rate as determined earlier, the highest possible frequency com-
ponent of ANY sound can be no higher than 24KHz. Many people lack the ability
to hear sounds above 16KHz, but some exceptional persons can hear up to 24KHz
before the sound exceeds their audible range, including one or two classical
musicians of note. Most music does not contain notes above 3KHz, so very high
sound frequencies are generally for specialist applications.
Fundamental notes are not the only consideration, however. To main-
tain the quality of the music, the timbre must be maintained. This is direct-
ly related to the shape of the curve of the waveform, which in turn depends
upon maintaining the existence of higher frequency harmonics. Harmonics are
often separated by octaves, and an increase of one octave results in the fre-
quency being doubled (octaves and frequencies are related logarithmically).
So a 3KHz note with several harmonics may have most or all of its harmonics
destroyed by the low-pass filter. In the extreme case, a square wave which
consists theoretically of an infinite number of harmonics may be reduced to
a sine wave by the low-pass filter.
Of course, digitising whole sections of music as opposed to single
notes solves this problem, but at the expense of memory. If the sampling rate
used by a hardware sampler is 20KHz, and one second of music is sampled, the
data takes up 20K of memory. This limits the maximum amount of sound that can
be digitised in one go at that sampling rate to 25.6 seconds (assuming that
you can fill all of CHIP RAM with the data!). For lower sampling rates more
data can be digitised, but the memory problem still remains. One solution is
to digitise sections at a time, compress the resulting data using a suitable
data compression algorithm, then decompress them into CHIP RAM and play them
in sequence. Needless to say, the decompression algorithm needs to be fast,
and synchronised to the sound playing. The data must not be decompressed into
an area of memory already being used for sound reproduction, hence the decom-
pression of data MUST NOT catch up with the sound generation! Grabbing the
data from disc is another possible way around the problem-directly accessing
the disc tracks yields a data transfer rate of 62500 bytes per second peak,
and can hence be used to lob in data piecemeal, provided that there is suffi-
cient disc space. Again, compression/decompression may be needed.
Hardware:Disc Management
Disc drives on the Amiga are controlled from two sources, the CIAs and Paula.
The CIAs govern such features as motor control, drive selection (#6) and head
movement. The FLAG pin of CIA-B is connected to the /INDEX signal of the
disc drive.
The ADKCON register mentioned above in the sound management section
is also used to control the disc controller. Bits 14-12 and 10-8 of this
register are used for disc control, and the bit assignments are:
Bit     Name      Function
---     ----      --------
15      SETIT     see DMACON
14,13   PRECOMP   Sets precompensation
12      MFMPREC   0=GCR encoding, 1=MFM encoding
11      UARTBRK   used for the UART, not used here
10      WORDSYNC  1=enable synchronisation
9       MSBSYNC   1=enable GCR synchronisation
8       FAST      0=4 microseconds/bit (GCR)
                  1=2 microseconds/bit (MFM)
By appropriate programming of this register, it is possible to enable Amiga
disc drives to read either MFM or GCR encoded discs. For explanation of the
terms MFM and GCR, see a comprehensive data source on disc drive operation,
as I have temporarily forgotten the relevant information.
Both encoding mechanisms require appropriate software to generate
the encoded data before writing to disc, and to decode the data read from the
disc. The appropriate routines for data encoding/decoding for MFM form part
of the trackdisk.device - the Amiga uses MFM encoding for its discs as the
default choice.
The need for such encoding is explained simply - raw data cannot be
written as is to disc. Because of limitations imposed by, among other things,
the laws of electromagnetism, the data has to be encoded in a manner allowing
the data to be securely stored in the form of magnetic flux changes on the
disc surface. I suggest searching out a concise reference work on the subject
before attempting one's own MFM/GCR encoding/decoding software!
As to the existence of these two systems, they have their own hist-
ory. However, GCR coding is used for Apple Macintosh discs, and there exists
a Macintosh emulator for the Amiga called A-MAX II using Paula's GCR encoding
ability to read Mac discs directly.
Precompensation is a little bit more difficult to explain. When the
disc system writes data to the disc, the data is written as a series of flux
changes onto the magnetic medium. The time for each flux change to occur is
called the half-zero-bit length, and the gap synthetically introduced to in-
crease data security for high-speed transfers is the precompensation. Normal-
ly the faster the data transfer, the higher the precompensation needed to av-
oid data read/write errors. Four possible settings are provided for by Paula.
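A minimal sketch of setting these bits for normal Amiga (MFM) discs might
look like the following. It assumes the custom chip base is in A5, as
elsewhere in this file, and that no precompensation is wanted; the value
$7700 is chosen so that the UARTBRK bit is left alone.
ADKCON   equ     $09E
         move.w  #$7700,ADKCON(a5)   ;clear PRECOMP, MFMPREC, WORDSYNC,
                                     ;MSBSYNC and FAST
         move.w  #$9100,ADKCON(a5)   ;SETIT + MFMPREC + FAST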
Now we have the ability to set up the disc controller to provide the
encoding system, precompensation and disc controller clock rate. Now we need
to tell Paula where the disc transfer buffer lies. This involves the
registers DSKPTH (offset $020) and DSKPTL (offset $022). These two registers
provide a pointer into a buffer in CHIP RAM. This buffer can be used for either
read or write operations.
Having informed Paula of the disc transfer buffer location, we need
to inform Paula of 1) the length of data to transfer, and 2) the data direc-
tion (read/write). The register DSKLEN (offset $024) performs these functions
in one go. The bit assignments of DSKLEN are:
Bit Name Function
--- ---- --------
15 DMAEN Enable Disc DMA
14 WRITE 0=read data from disc, 1=write data to disc
13-0 LENGTH 14-bit number : no of WORDS to transfer
When DMAEN is set to 1, the data transfer is theoretically enabled. The use
of the word 'theoretically' is deliberate, because Paula contains a mechanism
to prevent accidental disc writes. Firstly, the DSKEN bit in DMACON must be
set, and even if it is set, the DMAEN bit of DSKLEN (CARE! confusion may
arise in the names here!) must be set TWICE, by writing the same value to
DSKLEN two times in succession, before the operation is executed. In
addition, the WRITE bit must only be set for a genuine write operation! If
you try changing the value of this bit between the first and second writes
to DSKLEN, a weird and wonderful sequence of events may just occur, leading
among other things to Paula shuffling off her mortal coil...
The orderly sequence for DSKLEN is as follows:
MOVE.W #0,DSKLEN(A5) ;turn off disc
MOVE.W #$8210,DMACON(A5) ;enable disc DMA
;just in case
LEA disbuf(PC),A0
MOVE.L A0,DSKPTH(A5) ;set up disc buffer
CLR.W D0
BSET #15,D0 ;set DMAEN
BSET #14,D0 ;set WRITE if wanted
MOVE.W #LENGTH,D1 ;amount of data
;to transfer
ADD.W D0,D1
MOVE.W D1,DSKLEN(A5) ;set up disc
MOVE.W D1,DSKLEN(A5) ;now execute!
... ;here wait until the
;disc DMA is finished...
MOVE.W #0,DSKLEN(A5) ;and shut off when done.
The DSKBLK interrupt is provided in the INTREQ/INTENA registers so that the
processor can discover when the disc controller has finished. When the number
of words specified in DSKLEN has been transferred in whichever direction has
been chosen, the DSKBLK interrupt is signalled. It is generated when the last
word of data is transferred.
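If the DSKBLK interrupt is not enabled in INTENA, the request bit can
simply be polled instead. The following is only a sketch of how the '...'
in the sequence above could be filled in (the label name is my own); DSKBLK
is bit 1 of INTREQ/INTREQR. Clear the request bit once before starting the
transfer as well, in case it is left over from an earlier one.
INTREQR  equ     $01E
INTREQ   equ     $09C
dskwait  move.w  INTREQR(a5),d0
         btst    #1,d0               ;DSKBLK request set yet?
         beq.s   dskwait             ;no, keep waiting
         move.w  #$0002,INTREQ(a5)   ;clear the DSKBLK request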
To examine the current status of the disc controller, there exists
the DSKBYTR (offset $01A) register, which is assigned as follows:
Bit     Name        Function
---     ----        --------
15      BYTEREADY   Signals that byte in lower 8 bits is valid
14      DMAON       Indicates if disc DMA is active. DMAON
                    set to 1 when DMAEN of DSKLEN is 1 AND
                    DSKEN of DMACON is 1 (CARE WITH NAMES!)
13      DSKWRITE    Copy of WRITE in DSKLEN
12      WORDEQUAL   Disc data equals DSKSYNC
11-8                Unused
7-0     DATA        Current data byte from disc
Incidentally, it is possible to use this data register to read the data from
the disc using the 68000 instead of using DMA (should you want to!). Whenever
a complete byte is received from the disc, the disc controller sets the
BYTEREADY bit. The processor then knows that the data in the lower 8 bits is
a valid data byte. After DSKBYTR is read, the BYTEREADY flag is automatically
reset. The DMA system normally performs this without intervention from the
68000.
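A sketch of such a processor-driven read, assuming that the drive has
already been selected, the motor is running and the head is on the wanted
track (none of which is shown here), and that the label and register choices
are my own:
DSKBYTR  equ     $01A
* in: a0 = destination buffer, d1.w = number of bytes to read - 1
rdbytes  move.w  DSKBYTR(a5),d0
         bpl.s   rdbytes             ;bit 15 (BYTEREADY) clear, wait
         move.b  d0,(a0)+            ;store the data byte (bits 7-0)
         dbra    d1,rdbytes
         rts
The bytes obtained are still in encoded (MFM or GCR) form, and must be
decoded by software afterwards.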
Sometimes, instead of reading an entire track of data at once, the
programmer may wish to read data starting at a specific position. The
programmer uses the DSKSYNC register (offset $07E) to determine where the
disc data transfer will begin. The value is not an offset but a 16-bit
pattern (the synchronisation word). The disc controller compares the incoming
data stream against DSKSYNC, and signals a match via the WORDEQUAL bit of
DSKBYTR. If the WORDSYNC bit of ADKCON is set, data read before the first
match is NOT transferred to the buffer, and the DMA transfer begins only once
the pattern has been found. Thus the disc controller can be programmed to
wait for the synchronisation mark at the start of a data block.
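As an illustration, and assuming MFM discs in the standard AmigaDOS format
(whose sectors are preceded by the synchronisation word $4489), waiting for
the sync mark can be set up roughly as follows before starting the DMA:
ADKCON   equ     $09E
DSKSYNC  equ     $07E
         move.w  #$4489,DSKSYNC(a5)  ;pattern to wait for
         move.w  #$8400,ADKCON(a5)   ;SETIT + WORDSYNC (bit 10)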
Two other registers exist. These are DSKDAT (offset $026) which is
used to contain the data written to the disc by the DMA controller, and the
DSKDATR register (offset $008) which contains data read from the disc. DSK-
DATR is an early-read register, NOT accessible by the 68000.
Hardware:Interfaces
There are three interfaces to handle here. These are the parallel interface,
the serial interface, and the analogue inputs to the gameports (which can be
used for purposes other than game paddles).
The parallel interface:this is primarily controlled via the CIAs, &
the data lines are coupled to PB7-PB0 of CIA-A (i.e., CIAAPRB, whose address
is $BFE101). The PC output of CIA-A is connected to the DATA READY signal of
the handshake line, and the FLAG pin to the DATA ACKNOWLEDGE signal. Since
data register B of each CIA is equipped with handshaking built in, whereby an
access (read or write) causes PC to go low for one clock cycle, writing to
CIAAPRB automatically sends out a DATA READY signal. When the connected dev-
ice responds, it pulls the line connected to FLAG low to signal DATA ACKNOW-
LEDGE, and the FLAG bit in the ICR is set. Because of this, it is possible to
process data output to the Centronics interface via an interrupt routine, &
allow programs to continue with other processing while the interrupt routine
handles the output - ideal for printer spooling, for example.
The handshaking process can also be used for data INPUT from the
Centronics interface (assuming a bidirectional peripheral is connected!).
PC and FLAG are handled automatically by the CIA, and handling the FLAG
interrupt is virtually all that the programmer needs to do under normal
circumstances.
CIA-B is used for the SELECT and BUSY signals, bit 2 of CIABPRA (at
address $BFD000) being SELECT, and bit 0 being BUSY. The BUSY signal is used
for communication with slow peripherals (e.g., printers), and the interrupt
routine can also wait for the BUSY signal to change before continuing output
to a printer.
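Before resorting to interrupts, the handshake can be tried out with a
simple polling loop. The following is a sketch only: it assumes the
operating system has been shut out (reading CIAAICR clears ALL of the CIA-A
interrupt flags, including the keyboard's), that BUSY reads as 1 while the
printer is busy, and the label names are my own.
CIAAPRB  equ     $BFE101             ;parallel data lines
CIAADDRB equ     $BFE301             ;data direction, port B
CIAAICR  equ     $BFED01
CIABPRA  equ     $BFD000
* in: d0.b = byte to send
parout   move.b  #$FF,CIAADDRB       ;all 8 data lines are outputs
pbusy    btst    #0,CIABPRA          ;BUSY?
         bne.s   pbusy               ;yes, wait
         tst.b   CIAAICR             ;throw away any stale FLAG
         move.b  d0,CIAAPRB          ;PC pulses low: DATA READY
packw    move.b  CIAAICR,d1
         btst    #4,d1               ;FLAG bit: DATA ACKNOWLEDGE?
         beq.s   packw
         rts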
The serial interface:the serial interface is controlled by a combin-
ation of CIA registers and Paula registers. The CIA connections for the ser-
ial interface are:
/DTR signal : CIA-B PA7
/RTS signal : CIA-B PA6
/CD signal : CIA-B PA5
/CTS signal : CIA-B PA4
/DSR signal : CIA-B PA3
All of these signals are sent through inverter logic as part of the RS-232
driver hardware, and so the signals are active low. Setting the corresponding
CIA-B bit to 0 sets the corresponding RS-232 line to high. TAKE CAREFUL NOTE
OF THIS! Forgetting the inverse logic of this RS-232 interface is a common
source of interfacing problems.
When using RTS/CTS protocol, RTS should be made an output (set the
corresponding bit in CIABDDRA) and CTS an input (clear the appropriate DDRA
bit). I am not currently sure how to handle XON/XOFF, so until I have access
to the appropriate data, I shall leave XON/XOFF undocumented.
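A sketch of the RTS/CTS set-up just described, remembering the inverse
logic (the label name is my own, and a real program would not sit in a
loop for ever waiting for CTS):
CIABPRA  equ     $BFD000
CIABDDRA equ     $BFD200
         bset    #6,CIABDDRA         ;PA6 (RTS) is an output
         bclr    #4,CIABDDRA         ;PA4 (CTS) is an input
         bclr    #6,CIABPRA          ;0 = RTS line high (active)
ctswait  btst    #4,CIABPRA          ;CTS bit 0 (active) yet?
         bne.s   ctswait             ;no, wait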
One feature of serial data transfer using RS-232 is that clock sig-
nals are not provided. This means that both sender and receiver must provide
their own timing, and that the times must match for secure data transfer. A
set of standard baud rates exist for RS-232, typical values being 300 baud,
1200, 2400, 4800 and 9600 baud. Some fast peripherals (e.g., the new modems
for mainframe communications using SYSTEM-X exchanges) can have a maximum
baud rate of 38400 baud (but since they're £5,000 each, few readers of this
DOC file will have one coupled to their Amigas...) and high baud rates are
also a feature of experimental computer-moderated radio transmissions using
short-wave radio.
The serial interface controller, or UART (this admonitive acronym
stands for 'Universal Asynchronous Receiver/Transmitter', which is a lot of
fun to try and say after several Bacardis) allows setting of the baud rate
in the SERPER register (custom chips, offset $032, write-only). This regis-
ter also controls the data length to some extent. Bit 15 (the LONG bit), if
set, makes the length of the receive data 9 bits instead of 8. The remain-
ing 15 bits determine the baud rate.
Baud rate determination is indirect. Again, the number used is the
number of bus cycles (just as for audio data sampling rates) taken to
transmit one bit of data. If it takes N bus cycles to transmit a bit, the
number N-1 must be written to SERPER (for some perverse reason). So the
relationship between baud rate and the value to write into SERPER (here
designated as S) is
S = (1 / (B * 2.79365E-7)) - 1
where B is the baud rate, and 2.79365E-7 is the time in seconds taken for one
bus cycle (279.365 nanoseconds). So, for 4800 baud transmit/receive rate, the
value is
S = (1 / (4800 * 2.79365E-7)) - 1
which is 744.738. Rounding to the nearest integer, we have 745. So to select
4800 baud we use
move.w #745,SERPER(a5)
for 8-bit data transfers, and
move.w #$8000+745,SERPER(a5)
for 9-bit receives.
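Rather than working the value out by hand, the assembler can be left to
do it. This sketch assumes an assembler with 32-bit integer expressions (as
Devpac has), and uses 3579545 bus cycles per second, which is simply the
reciprocal of the 279.365 nanosecond figure above. The symbol names are my
own.
SERPER   equ     $032
BUSCLK   equ     3579545             ;bus cycles per second
BAUD     equ     9600
SERVAL   equ     ((BUSCLK+(BAUD/2))/BAUD)-1  ;rounded to nearest
         move.w  #SERVAL,SERPER(a5)
For 9600 baud this gives 372, and for 4800 baud it gives the 745 worked out
above.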
Ok, we can now set the baud rate, and have access to the control
signals. The other registers we need are the Paula serial data registers
(Cross-reference #2):
Register Offset Function
-------- ------ --------
SERDAT 030 Contains data to send (RS-232 output)
SERDATR 018 Contains data to read (RS-232 input)
The SERDATR register for data reading (RS-232 input mode) has several bits
allocated to various functions. The bit assignments for SERDATR are:
Bit     Name    Function
---     ----    --------
15      OVRUN   Overrun of receiver shift register if set
14      RBF     Receive buffer full if set
13      TBE     Transmit buffer empty if set
12      TSRE    Transmit shift register empty if set
11      RXD     Matches level on RXD line
10              Unused
9       STP     Stop Bit Value
8       DB8     Depends on LONG in SERPER
7-0     DB7-0   Receive data buffer bits 7-0
SERDAT (offset $030) is used to contain the data to be transmitted from the
Amiga. Because of the time taken for serial data transfer using normal RS-232
protocols, there is no provision for a data buffer pointer and a DMA control
read/write system to automatically read in or write out a block of data of a
given size, as the need was felt not to exist by the designers. The maximum
possible data transfer rate corresponds to a SERPER value of zero, and equals
approximately 3,580,000 baud! Not a regularly selected baud rate...few appli-
cations need this kind of speed (usually confined to military systems, which
also possess inbuilt data encryption, reversible frequency modulated pertur-
bation of the data stream and other weird features not present on the Amiga)
and it is very unlikely that readers of this DOC file will ever need it.
Two interrupts are provided for handling serial transfers. The RBF
(Receive Buffer Full) interrupt handled via the IPL5 interrupt vector is the
interrupt used for handling RS-232 transfers from the outside world to the
Amiga, and the TBE (Transmit Buffer Empty) interrupt handled via the IPL1
interrupt vector is used to handle RS-232 transfers from the Amiga to the
outside world. Both interrupt vectors should be initialised appropriately and
the interrupt code should principally restrict itself to either sending or
receiving a byte of data. Configuring the baud rates and other allied func-
tions should be left to other routines called by the main program.
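Setting up the vectors and enabling the two interrupts might look like
this when the operating system has been bypassed. It is a sketch only: it
assumes a plain 68000 with its vector table at address zero, and ser_rbf
and ser_tbe are hypothetical labels for routines such as those below.
INTENA   equ     $09A
         move.l  #ser_tbe,$64.w      ;level 1 autovector (TBE)
         move.l  #ser_rbf,$74.w      ;level 5 autovector (RBF)
         move.w  #$C801,INTENA(a5)   ;SETIT + master enable + RBF + TBE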
The procedures are as follows:
* Reading a byte of data from RS-232
* This is interrupt code, so put an RTE after it
* if using 'as is'. If using AmigaDos, there are
* other ways of doing it - see elsewhere.
* assumes ptr to custom chips in A5!!
move.w INTREQR(a5),d0 ;get interrupt request
bclr #11,d0 ;RBF interrupt?
beq.s not_RBF ;no
move.w SERDATR(a5),d0 ;get received data (clear RBF)
move.w d0,ser_word ;save it
move.w #$0800,INTREQ(a5) ;and acknowledge the RBF
;interrupt (INTREQ bit 11)
not_RBF ...
The RBF (Receive Buffer Full) bit is set in SERDATR and INTREQ/R whenever a
data word is transferred from Paula's internal shift register to the SERDATR
register. At this point, SERDATR should be read, to clear space for the next
incoming data word. Reading the next data word clears RBF, and signals that
SERDATR is ready to receive the next data word being read into Paula's int-
ernal shift register.
If SERDATR is not read, and the shift register has received another
complete data word, OVRUN is set. This signals that no more data can be rec-
eived because both SERDATR and the shift register contain inputted data. When
SERDATR is read under these conditions, OVRUN is reset, and RBF also. RBF is
then immediately set again and the full contents of the internal shift regis-
ter are loaded into SERDATR again, allowing more data to enter the shift reg-
ister. Obviously, once RBF is set again, the data must be read from SERDATR
again.
The format of the data to be read is determined by SERDATR and
SERPER. If the LONG bit of SERPER is set, the data is 9 bits, else 8 bits. If
the data is 8 bits, then bits 8 and 9 mark the stop bits, if present, and
are set if there is a stop bit there. If the data is 9 bits, bit 9 is the
stop bit if present, again set if a stop bit is encountered. Note that all
data is sent with one start bit, which always has the value 0. This applies
both to reception and transmission, and the hardware detects the end of a
data word by noting the transition from the 1 of a stop bit to the 0 of the
next start bit. Note that when the data transmitter has finished sending its
data to the Amiga, the RBF bit will never be set after the last word has been
processed, and the interrupt routine will never be called from this point on.
Usually, RS-232 systems signal this, either by sending a data byte
or bytes at the start indicating the size of the data block being transmit-
ted, or by sending a special 'end of transmission' character at the end of
transmission. Two typical choices are CTRL-D (known as the ASCII EOT char-
acter) or CTRL-Z (EOF, or end of file, on many systems). If the transmission
consists of binary data instead of ASCII characters, then either the start
of the transmission must contain the byte count of the block, or else another
means of signalling end of transmission is required, as any of the control
characters could be a valid data byte, unless an encoding scheme is used.
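In the interrupt routine this can be as simple as comparing each received
byte against the chosen end marker. A sketch for ASCII transfers ending in
CTRL-D, where ser_done is a hypothetical flag byte of my own for the main
program to test:
EOT      equ     $04                 ;CTRL-D
         move.w  ser_word,d0         ;word saved by the RBF routine
         cmp.b   #EOT,d0             ;low byte = received character
         bne.s   not_eot
         st      ser_done            ;tell the main program
not_eot  ...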
* Writing a byte of data to RS-232
* Again,interrupt code, pointer to the
* custom chips in A5.
move.w INTREQR(a5),d0 ;get interrupt request
bclr #0,d0 ;TBE interrupt?
beq.s not_TBE ;no
move.w ser_word,d0 ;get data to send
move.w d0,SERDAT(a5) ;send it (clear TBE)
move.w #$0001,INTREQ(a5) ;acknowledge TBE (INTREQ bit 0)
not_TBE ...
The data to be output in this case is written to SERDAT. It is then immedia-
tely transferred to the output internal shift register. This is signalled by
the TBE bit, which is set to indicate that SERDAT is able to receive more
data. Once TBE is set, more data should be written to SERDAT to maintain the
data flow. Should this not occur, and the output shift register is emptied
before SERDAT is reloaded, then the TSRE (Transmit Shift Register Empty) bit
is set. This is cleared when SERDAT is loaded, as is TBE. But TBE is
immediately set again as the contents of SERDAT are sent to the output shift
register, while TSRE stays cleared, allowing more data to be written.
The format of the data to be sent is determined by SERDAT. The data
formats for different cases are:
8 bit data, 1 stop bit : 00000001 dddddddd
8 bit data, 2 stop bits : 00000011 dddddddd
9 bit data, 1 stop bit : 0000001d dddddddd
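For example, to send the byte in D0 as 8 data bits with 1 stop bit, the
word written to SERDAT can be built like this (a sketch, with D0 as my own
choice of register):
SERDAT   equ     $030
         and.w   #$00FF,d0           ;keep the 8 data bits
         or.w    #$0100,d0           ;stop bit goes in bit 8
         move.w  d0,SERDAT(a5)       ;start bit is added by the hardware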
To stop transmission of data, one bit of the ADKCON register (offset $09E) is
provided, the UARTBRK bit (bit 11). Setting this bit using the instruction
move.w #$8800,ADKCON(a5)
(ADKCON uses the SETIT mechanism-see DMACON) stops serial data transfer and
clears TXD (the transmit data line) of the serial port.
Analogue inputs:the gameports possess two analogue inputs each, and
it is possible to connect game paddles to them or other analogue signal gene-
rating equipment. Game paddles usually use a sliding or twist knob to change
the resistance of a potentiometer (known as a 'pot' for short - hence the use
of POTxxxx register names for this interface!).
Analogue joysticks, with a potentiometer for the X and the Y direc-
tion, can also be connected. The values that these produce are read in the
POTxDAT registers (POT0DAT, offset $012, POT1DAT, offset $014). Bits 0-7 are
used for the X-value, and bits 8-15 for the Y-value.
Now, how does this work? Well, Paula contains a circuit to handle a
simple analogue-to-digital conversion. The requirement is that the maximum
resistance of the potentiometers should be 470 Kilohms (with a tolerance of
plus or minus ten percent). One side of the potentiometer is connected to the
5-volt power supply, and the other to one of the analogue inputs. These lead
internally to Paula and to a capacitor, one for each input, connected between
the input and ground.
The paddle outputs are placed briefly at ground, discharging the ca-
pacitors. Also, the counters in POTxDAT are cleared. For each raster line,
the counters are incremented by one while the capacitors are charged through
the resistors. When the voltage across the capacitor exceeds a preset value,
the corresponding counter is stopped. Thus the counter state is directly pro-
portional to the input resistance. Small values equal low resistances, large
values equal high resistances.
The POTGO register (Cross-reference #7) determines whether the ana-
logue pins are inputs or outputs. The bits are assigned as follows:
Bit     Name    Function
---     ----    --------
15      OUTRY   1=gameport 1 POTY bit is output, 0=input
14      DATRY   Gameport 1 POTY data bit
13      OUTRX   1=gameport 1 POTX bit is output, 0=input
12      DATRX   Gameport 1 POTX data bit
11      OUTLY   1=gameport 0 POTY bit is output, 0=input
10      DATLY   Gameport 0 POTY data bit
9       OUTLX   1=gameport 0 POTX bit is output, 0=input
8       DATLX   Gameport 0 POTX data bit
7-1             unused
0       START   Discharge capacitors & begin analogue measurement
A write access to POTGO (offset $034) clears both POTxDAT registers. POTGOR
also exists (offset $016) to allow the states to be read.
Normally, START is set to 1 at the start of the vertical blank int-
erval, and the valid potentiometer values can be read at the start of the
next VBL, immediately prior to setting START to 1 again.
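A sketch of a routine to be called once per vertical blank, following the
scheme just described (pot0x and pot0y are my own variable names). Note that
writing $0001 to POTGO also sets every OUTxx bit to 0, making all four lines
inputs, which is what is wanted for paddles.
POT0DAT  equ     $012
POTGO    equ     $034
readpot  move.w  POT0DAT(a5),d0      ;counts from the last frame
         move.b  d0,pot0x            ;bits 0-7 = X
         lsr.w   #8,d0
         move.b  d0,pot0y            ;bits 8-15 = Y
         move.w  #$0001,POTGO(a5)    ;START the next measurement
         rts
pot0x    dc.b    0
pot0y    dc.b    0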
If the corresponding OUTxx bit above is set to 1, the corresponding
line is treated as a digital output, and the corresponding DATxx bit is sent
out along it. If OUTxx=0, then DATxx in POTGOR yields the current state of
those lines as digital inputs.
Paddle buttons use the same bits as the joystick data registers (ho,
hum!). The assignments are:
                Gameport 0       Gameport 1
                ----------       ----------
Left Button     JOY0DAT bit 9    JOY1DAT bit 9
Right Button    JOY0DAT bit 1    JOY1DAT bit 1
For each of these, the bit is 1 if the button is pressed.
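Testing the buttons is then a matter of looking at two bits. A sketch for
gameport 0, with lbutton and rbutton as my own flag bytes ($FF = pressed):
JOY0DAT  equ     $00A
         move.w  JOY0DAT(a5),d0
         btst    #9,d0               ;left paddle button
         sne     lbutton
         btst    #1,d0               ;right paddle button
         sne     rbutton
         ...
lbutton  dc.b    0
rbutton  dc.b    0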
Hardware:Mouse, Keyboard, Joysticks
I assume that most of those readers requiring this file for hardware documen-
tation know what a keyboard, a mouse and a joystick are. Some may even have
dismantled several of these items (I have-much to the dismay of those people
whose equipment I have dissected!) and found out about the inner workings of
the basic hardware. However, there exist extra components within these pieces
of hardware whose function needs a little extra explanation.
Modern keyboards are no longer simply a matrix of switches. Most of
the keyboards on modern computers have internal controller chips of their own
with their own RAM and ROM, allowing customisation and configuration of the
keyboard. IBM PCs (ugh!) use an Intel 8047 controller, Sinclair QLs (yeuck!)
use an Intel 8049, Atari STs (duh...) use a 6301 processor chip (this was at
one time the heart of expensive desktop computers!) and the Amiga? Well, the
Amiga uses the MOS 6500/1 processor. So what? Well, a 6500/1 is really a 6502
processor, as found in the PET/VIC-20/Commodore 64, with on-chip RAM and ROM.
The ROM is mask-programmed with the control program for the Amiga keyboard.
Cross-reference #1:the CIAs are coupled via several links to the 6500/1 for
keyboard communication.
So, if it wasn't for the mask programmed ROM, in theory anyone with
a 6502 assembler and the motivation could write his own keyboard controller
program for the Amiga. Sad to say, the existence of a control program embed-
ded in ROM, plus the one-way data traffic (keyboard to Amiga) makes this im-
possible. Atari ST keyboard controllers CAN be reprogrammed, but I warn any-
one tempted to try, it is HARD.
The 6500/1 has a 2K ROM, 64 bytes of static RAM, 4 bidirectional 8-
bit ports, a 16-bit counter with its own control input, and a clock generator
of its own. This chip is interfaced to the Amiga via two precision 556 timer
chips. These provide a reset signal for the Amiga. The mechanism by which the
reset mechanism is provided is interesting - more later.
The 6500/1 reads the keyboard matrix via ports C and D (the 6500/1
has four I/O ports on-chip) to obtain details about which key has been either
pressed or released. This information is converted from the bitwise port in-
formation, into a raw key code passed out to the Amiga via port A. This data
is transmitted serially, and the requisite line from the 6500/1 is connected
to the SP input line of CIA-A. The CNT line of the same CIA provides the key-
board system with the clock signal for signal synchronisation.
Having scanned the matrix, the 6500/1 returns raw key codes whenever
there is a state change in the keyboard. If a key is pressed, the code for
that key is sent. If that key is released, the code for that key, with bit 7
set, is sent to signal 'key released'. If a different key is pressed before
the release of the original key, the new keypress is sent first. The keyboard
events are sent in the order of occurrence where possible.
Special keys are wired into a different part of the matrix, so that
keycode clashes do not occur with the special keys SHIFT, CONTROL, ALT & the
two AMIGA keys. Both left and right SHIFT, left and right ALT, and left and
right AMIGA are given their own separate keycodes, making for a massive num-
ber of keycoding possibilities. CAPS LOCK also has its own key code, and is
treated somewhat differently. The 6500/1 simulates a push-button with this
key, no release information being supplied. The CAPS LOCK key state is only
judged to have changed when pressed - releasing the CAPS LOCK key is ignored
by the 6500/1. When pressed the first time, the LED lights up, and a code is
sent corresponding to 'key pressed'. The second time CAPS LOCK is pressed, at
which point the LED becomes unlit, the code for 'key released' (bit 7 set) is
sent. This is the only key treated thus - all other keys have their pressed/
released events handled in the normal way described above.
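Separating a raw key code into the key number and the pressed/released
state takes two instructions. A sketch, with the register usage being my
own choice:
* in:  d0.b = raw key code
* out: d0.b = key number ($00-$7F)
*      d1.b = $FF if released (bit 7 was set), 0 if pressed
keysplit bclr    #7,d0               ;test and strip bit 7
         sne     d1
         rts
For CAPS LOCK, d1 instead says whether the LED has just gone out ($FF) or
come on (0).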
Key code groups are :
$00 - $3F : Normal ASCII letter keys in the main
keyboard section between the grey ENTER
and the CAPS LOCK key, plus the numeric
keypad keys except ENTER.
$40 - $4F : Codes of standard special keys, such as
the SPACE BAR, RETURN, TAB, BACKSPACE,
DEL, ESC, numeric keypad ENTER.
$50 - $5F : Function keys F1-F10, HELP.
$60 - $6F : SHIFT, ALT, CONTROL, AMIGA and CAPS LOCK
keys.
I shall supply the codes for the special keys individually. To work out the
other keys, it is possible to write a piece of code to scan the keyboard and
read the raw key codes. More of this later. The special key codes are :
Key Code
--- ----
LEFT SHIFT $60
RIGHT SHIFT $61
CAPS LOCK $62
CONTROL $63
LEFT ALT $64
RIGHT ALT $65
LEFT AMIGA $66
RIGHT AMIGA $67
There are also special codes sent for certain special functions. These are:
Keycode Function
------- --------
$F9 Last key code sent was incorrect
$FA Keyboard buffer of 6500/1 is full
$FC Error in keyboard self test (AGH! REPAIR TIME!)
$FD Start of keys held down on power up
$FE End of keys held down on power up
The $F9 code is sent if there has been disruption of the keyboard linkage. On
the A500, this usually means time to get the Amiga mended, but on A1000/A2000
machines this can result from unplugging a keyboard & then plugging in a new
keyboard while the Amiga is switched on. This allows the keyboard resynchron-
isation system to be activated, re-establishing secure communications.
The 6500/1 has an internal character buffer of 10 characters. If the
buffer becomes full (because software is not reading it quickly enough), the
6500/1 sends the $FA code to indicate a full keyboard buffer, and that sub-
sequent keypresses will be lost.
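Since the special codes all lie at $F9 and above, out of the way of the
ordinary key codes, a key handler can sort them out with a compare before
doing anything else. A sketch, where key_normal, key_lost and key_special
are hypothetical labels:
         cmp.b   #$F9,d0             ;d0.b = raw key code
         bcs.s   key_normal          ;below $F9: ordinary key event
         cmp.b   #$FA,d0
         beq.s   key_lost            ;buffer full, keys were lost
         bra.s   key_special         ;$F9, $FC, $FD or $FE
key_normal ...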
Keyboard data communications are always conducted from the keyboard
to the Amiga. The Amiga sends handshaking signals to the keyboard and also a
clock signal for data transfer synchronisation.
The actual order in which the bits are sent is not the usual order
of 76543210, but 65432107. So when the data byte is received, it needs to be
rotated one bit position to obtain the true keycode. The CIA shift register
of CIA-A contains the data once read, and sends a level-2 interrupt once the
data has been received. The level-2 interrupt code should then read that data
byte, output the handshake pulse and save the received code somewhere safe to
be processed later by a user program.
Keycodes furthermore are inverted, because the circuitry is designed
as ACTIVE LOW. This means that a low voltage corresponds to a 1, and a high
voltage corresponds to a 0. To obtain the keycode, the bits must be inverted
back to the normal form.
Basically, the 6500/1 puts the data bits on its data line (KDAT),
plus a 20-microsecond low pulse on the clock line (KCLK). Between each of the
data pulses, 40-microsecond pauses are placed. Hence data transfer rate is 1
bit every 60 microseconds, or about 16667 bits per second (16667 baud). After
the last data bit has been sent, the 6500/1 waits for a handshake pulse. This
is performed by the Amiga pulling KDAT low (KDAT is the line wired to the SP
pin of CIA-A) for at least 75 microseconds the moment that the last data bit
is received; Commodore's documentation asks for 85 microseconds to suit all
keyboard models.
To handle the keyboard interface, the kind of code I would use looks
like this, on occasions when the operating system is being bypassed:
* complete level 2 interrupt handler for CIA-A
* to initialise CIA-A to activate this routine properly
* initialise with CIAAICR = $88 and bit 6 of CIAACRA clear
* in the main program. This sets SPMODE = input (shifted
* in by CNT), generate interrupt when SP full.
* Don't forget to point IPL2 vector to this code as well!
ciaint move.w #$2700,sr ;lock out higher ints
;just in case
move.l d0,-(sp) ;keep this
move.w INTREQR(a5),d0
btst #3,d0 ;CIA interrupt?
beq.s ciaint_1 ;nope!
move.w #$0008,INTREQ(a5) ;system acknowledge IRQ
move.b CIAAICR,d0 ;check CIA-A interrupt reg
btst #7,d0 ;CIA interrupt proper?
beq.s ciaint_1 ;no, cock up somewhere so ignore
btst #3,d0 ;SP data full?
beq.s ciaint_1 ;no, cock up somewhere so ignore
addq.w #1,ciacount ;cia interrupt counter
move.b CIAASP,d0 ;get key code from keyboard
or.b #$40,CIAACRA ;set SPMODE=output (pulls
;KCLK low!)
not.b d0
ror.b #1,d0
move.b d0,rawkey ;proper raw key code
moveq #8,d0 ;wait for 75 microsecs
ciaint_0 subq.w #1,d0 ;while pulling KCLK low
bne.s ciaint_0
and.b #$BF,CIAACRA ;SPMODE=input again
ciaint_1 move.l (sp)+,d0
rte
ciacount dc.w 0
rawkey dc.b 0,0
Keyboard reset mechanism:this is managed by the 6500/1. Pressing the sequence
of keys CTRL-AMIGA-AMIGA causes a hard reset. The 6500/1 control program will
sense this sequence, and pull KCLK low for about 0.5 seconds. This tells the
reset circuit of the Amiga to generate a hard reset. After one or more of the
keys is released, the 6500/1 also undergoes a reset, rebooting its control
program from scratch, signalled by flashing of the CAPS LOCK LED. Since KCLK
is connected to the CNT pin of CIA-A, and the above interrupt routine shows a
way of pulling a keyboard line low for 75 microseconds, it is tempting to
think that increasing the delay will allow software generation of a hard
reset:just set SPMODE = OUTPUT for CIA-A, hang the processor, and after 0.5
seconds or more the Amiga resets. Be warned, though, that SPMODE = OUTPUT
drives the SP pin, which is wired to KDAT and not to KCLK, so test this trick
on the target machine before relying on it.
Mouse handling:the mouse counters are part of Denise. There are two
registers, called JOY0DAT (offset $00A) and JOY1DAT (offset $00C) for each of
the gameports 0 and 1. Just to relieve the confusion, the back panel of the
A500 says 'Joystick port 1' and 'Joystick port 2'. Subtract 1 from the num-
bers to get the appropriate gameport counter. The high byte of each counter
counts the vertical pulse count from 0 to 255, the low byte the horizontal
pulse count from 0 to 255.
The mouse counters count 200 pulses per inch (about 79 pulses/cm).
This makes the counters overflow after the mouse has moved a little over 3 cm
(256 pulses). To
counter this, word-wide counters should be set up in software, and the actual
gameport counters used to update these. This is normally done by the opera-
ting system during the vertical blank interrupt. The following code is an
extract from a vertical-blank interrupt routine that I have used to handle
mouse counters:
;read mouse during VBL interrupt. mousex/y = old value of
;counters, mouseh/v = horiz/vert movement
                move.w  JOY0DAT(a5),d0          ;get mouse counter
                move.w  d0,newmouse             ;save for later
                and.w   #$FF,d0                 ;x counter
                sub.w   mousex,d0               ;diff = new-old
                cmp.w   #-127,d0                ;counter overflowed?
                bge.s   vblmouse1               ;no
                add.w   #256,d0                 ;else diff+256
                bra.s   vblmouseh               ;store it
vblmouse1       cmp.w   #127,d0                 ;counter underflowed?
                ble.s   vblmouseh               ;no
                sub.w   #256,d0                 ;else diff-256
vblmouseh       move.w  d0,mouseh               ;store horiz difference >0 = right
                move.w  newmouse,d0             ;get saved mouse counter
                lsr.w   #8,d0                   ;get vertical count
                sub.w   mousey,d0               ;diff = new-old
                cmp.w   #-127,d0                ;counter overflowed?
                bge.s   vblmouse2               ;no
                add.w   #256,d0                 ;else diff+256
                bra.s   vblmousev               ;store it
vblmouse2       cmp.w   #127,d0                 ;counter underflowed?
                ble.s   vblmousev               ;no
                sub.w   #256,d0                 ;else diff-256
vblmousev       move.w  d0,mousev               ;store vert difference >0 = down
                moveq   #0,d0
                move.w  newmouse,d0             ;get mouse counters
                lsl.l   #8,d0                   ;split across 2 words
                lsr.w   #8,d0                   ;isolate x
                move.w  d0,mousex               ;save
                swap    d0                      ;get y
                move.w  d0,mousey               ;save
newmouse        dc.w    0
mousex          dc.w    0
mousey          dc.w    0
mouseh          dc.w    0
mousev          dc.w    0
The algorithm used is as follows:the assumption is made that the mouse
counters are not changed by more than 127 pulses between reads. Both old and
new values are maintained, and new compared with old. The value
                diff = new - old
is calculated. If 0 < diff <= 127, the mouse movement was either right or
down, without overflow. If -127 <= diff < 0, the mouse movement was either
left or up, without underflow. If diff < -127, movement was right or down and
the counter overflowed from 255 back to 0. If diff > 127, movement was left
or up and the counter underflowed from 0 back to 255. For an overflow, the
actual mouse movement is computed as diff+256, while for an underflow it is
computed as diff-256. For example, old = 250 and new = 5 gives diff = -245,
which is a movement of 11 pulses to the right.
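The wrap-around arithmetic is easy to get wrong, so here is a small host-side
Python model of it (a sketch, not Amiga code). It uses a different formulation
from the compare-and-adjust in the listing: the difference is taken modulo 256
and then read as a signed 8-bit number, which on the 68000 would correspond to
a sub.b followed by ext.w. It relies on the same assumption of fewer than 128
pulses between reads. All names are illustrative.

```python
def counter_delta(old, new):
    """Signed movement between two reads of one 8-bit mouse counter."""
    diff = (new - old) & 0xFF              # 8-bit subtraction, wraps like the counter
    return diff - 256 if diff > 127 else diff


def update_mouse(joy0dat, old_x, old_y):
    """Split a JOY0DAT word and return (dx, dy, new_x, new_y)."""
    new_x = joy0dat & 0xFF                 # low byte = horizontal count
    new_y = (joy0dat >> 8) & 0xFF          # high byte = vertical count
    return counter_delta(old_x, new_x), counter_delta(old_y, new_y), new_x, new_y
```

With old = 250 and new = 5 the counter has wrapped past 255, and
counter_delta reports 11 pulses to the right. The reverse case reports 11 to
the left.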
To reset the mouse counters, use the JOYTEST register (offset $036).
This register is unusual. The register bit allocation is as follows:
Y Y Y Y Y Y y y X X X X X X x x
The Y bits are the upper 6 bits of the vertical counter, and the X bits are
the upper 6 bits of the horizontal counter. The yy and xx bits are connected
directly to the mouse input signals, and are not located anywhere in memory.
So these values cannot be changed at all in software. JOYTEST has the effect
of resetting both sets of mouse counters, making JOY0DAT and JOY1DAT equal in
value. Whatever value is sent to JOYTEST is sent to both of the JOYxDAT regi-
sters.
The mouse buttons are handled separately. If the mouse is attached
to port 0, the signals occur as follows:
LEFT BUTTON : CIAAPRA, Bit 6
RIGHT BUTTON : DATLY of POTGO (#7)
MIDDLE BUTTON(*) : DATLX of POTGO (#7)
For game port 1, the signals are:
LEFT BUTTON : CIAAPRA, Bit 7
RIGHT BUTTON : DATRY of POTGO (#7)
MIDDLE BUTTON (*) : DATRX of POTGO (#7)
For all of these, a zero bit value means that the button is PRESSED.
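The button table can be summed up in a few lines of Python (a host-side
sketch only). The CIAAPRA bit numbers come from the table above. The bit
positions used for the POTGO data bits (DATLX = 8, DATLY = 10, DATRX = 12,
DATRY = 14) are an assumption not stated in this section, so check them
against the POTGO description (#7) before use. The function name and the
dictionary layout are made up for illustration.

```python
def mouse_buttons(ciaapra, potinp, port=0):
    """Decode button states for game port 0 or 1 from raw register values.

    ciaapra: byte read from CIAAPRA
    potinp:  word holding the POTGO data bits (bit numbers assumed, see text)
    """
    left_bit = 6 if port == 0 else 7
    right_bit = 10 if port == 0 else 14     # DATLY / DATRY (assumed positions)
    middle_bit = 8 if port == 0 else 12     # DATLX / DATRX (assumed positions)

    def pressed(value, bit):
        return ((value >> bit) & 1) == 0    # active low: 0 means PRESSED

    return {
        "left": pressed(ciaapra, left_bit),
        "right": pressed(potinp, right_bit),
        "middle": pressed(potinp, middle_bit),
    }
```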
Joysticks are handled in a similar way-they use the same counters & JOYTEST.
However, to sense the direction in which the joystick is moved, the software
algorithm differs. Basically, the following table holds:
Joystick Right : Bit 1 JOYxDAT = 1
Joystick Left : Bit 9 JOYxDAT = 1
Joystick Back : (Bit 1 EOR Bit 0) = 1
Joystick Forward : (Bit 9 EOR Bit 8) = 1
The following piece of code can be used to generate direction indicators for
the joystick (here called DX for left/right, DY for up/down):
                move.w  JOY0DAT(a5),d0          ;get joystick value
                move.w  d0,d1                   ;& copy it
                lsr.w   #1,d1                   ;shift copy right
                moveq   #0,d2                   ;clear DX, DY
                btst    #1,d0                   ;bit 1 set (right) ?
                beq.s   notright                ;no
                move.w  #1,d2
notright        btst    #9,d0                   ;bit 9 set (left) ?
                beq.s   notleft                 ;no
                move.w  #-1,d2
notleft         swap    d2
                eor.w   d0,d1                   ;each bit EOR its upper neighbour
                btst    #0,d1                   ;result 1 (back) ?
                beq.s   notback
                move.w  #-1,d2
notback         btst    #8,d1                   ;result 1 (forward) ?
                beq.s   notfront
                move.w  #1,d2
notfront        swap    d2
Now D2 contains the value DX in the low word, DY in the high word, and the
programmer can handle this value ad lib. This method allows diagonal joystick
values to be managed (the example in the Amiga System Programmer's Guide does
not allow this) in a way that is useful for such things as games.
Note that to ensure diagonal values are read properly, it
might be prudent to embed the code above as a subroutine, save the value in
D2 in D3 after the first call, and then call the subroutine a few times more,
each time ORing the result of D2 into D3. This allows transient diagonal joy-
stick values to be strobed in case the joystick response is poor, a standard
trick used by games writers on systems with known poor joystick responses.
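The joystick decoding can also be modelled on a host machine. The Python
sketch below mirrors the 68000 routine (shift a copy right by one, EOR it
with the original, then test bits 0 and 8) and returns DX and DY as a pair
instead of packing them into one register. The function name is illustrative,
and the sign convention (+1 = right or forward, -1 = left or back) is taken
from the listing.

```python
def joystick_direction(joydat):
    """Return (dx, dy) for a JOYxDAT word."""
    mixed = joydat ^ (joydat >> 1)   # bit 0 = bit1 EOR bit0, bit 8 = bit9 EOR bit8
    dx = 0
    if joydat & 0x0002:              # bit 1 set: right
        dx = 1
    if joydat & 0x0200:              # bit 9 set: left
        dx = -1
    dy = 0
    if mixed & 0x0001:               # back
        dy = -1
    if mixed & 0x0100:               # forward
        dy = 1
    return dx, dy
```

A register value of $0002 decodes as (1, -1), i.e. right and back together,
which shows that diagonals come out without any special handling.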
The joystick fire buttons for each port correspond to the left mouse
buttons in each case-the states of each are read from the same bits of the
same register (see mouse above).
Hardware:Some Notes
This file does NOT contain information about the Enhanced Chip Set (from here
on known as the ECS). The ECS is able to access a CHIP RAM range of 1MB as
opposed to the 512K of the standard chip set, and I assume this is achieved
by expanding the various pointer high word registers to 4 bits wide, giving a
20-bit address as a whole. Mind you, I have discovered that making logical
assumptions such as this about Commodore hardware can lead a programmer right
up shit creek without a paddle.
The ECS is also supposed to possess some other functional
enhancements. The exact details are not known to me at the time of writing.
Anyone possessing ECS information is requested to supply it to the following
address to allow me to maintain precise updates:
Dave Edwards
232 Hale Road
WIDNES
Cheshire
WA8 8QA
The same applies to other DOC files in this series, which are also obtain-
able by sending a blank disc plus SAE for return postage to
Mark Meany
1 Cromwell Road
Polygon
SOUTHAMPTON
Hants
SO1 2JH
marking your envelope "CLUB DISC 4" and enclosing a covering letter explain-
ing your requirements. Other useful files and software are also available on
the various Club Discs from Mark above, and any back copies may be obtained
(assuming that archive copies still exist) by marking your envelope with the
legend "Club Disc N" where N is the disc number. Don't forget to include an
SAE or better still a jiffy bag for return postage, as neither he nor I are
rich enough to provide a freepost service!
Hardware:Logic Tutorial
For the mathematically minded, I present a logic tutorial allowing derivation
of minterms via the mechanism of developing alternational normal schemata to
create the individual minterm components.
Conventions:In this tutorial, AB is used to represent A AND B, A + B
is used to represent A OR B, and b is used to represent NOT B. In an expres-
sion such as
ABC + ABc + AbC
the AND operation takes precedence over the OR operation, and the NOT opera-
tion applies only to that letter typed in lower case. The expression (NOT B)
AND (NOT C) is represented by
bc
and this list of conventions is thus complete.
Rules:The following laws apply to all logical expressions:
1) AB is the same as BA, and A + B is the same as B + A.
This is the Commutative law.
2) A(BC) and (AB)C are equivalent. Similarly, A + (B + C)
is equivalent to (A + B) + C. This is the Associative
law.
3) The expression A(B+C) expands to AB + AC. This is the
Distributive law of conjunction over alternation.
4) The logical AND of any single term with an expression
of the form (B + b) has no effect - this is the identity
operation (equivalent to multiplying numbers by 1). So
A is equivalent to A(B+b) and further equivalent to
A(B+b)(C+c).
The first three of these laws apply to many mathematical systems other than
Boolean Algebra; in particular they apply to the arithmetic of integers,
rational numbers and real numbers. The fourth is special to Boolean Algebra,
because B + b always evaluates to 1 (true).
Alternational Normal Schemata:This long-winded term describes any
logical expression which contains terms of the form ABC combined using the
OR operator. So the expression
AB + ABc + ABCD + ABcD + Abcd
is an alternational normal schema, whereas
(A+BC)D + ABCD
is not. The reason for the name is this:alternation is another name for the
OR operator in mathematical literature, and a normal schema is one in which
the format of the component terms obeys a strict set of rules.
A DEVELOPED alternational normal schema is one where all of the
components contain the same set of variables. The creation of a developed
alternational normal schema involves scanning the expression for every
variable that it uses, and expanding all terms deficient in variables (by
law 4 above) until every term contains each variable. An example:
AB + C + b
becomes
AB(C+c) + (A+a)(B+b)C + (A+a)b(C+c)
and then
ABC + ABc + ABC + AbC + aBC + abC
+ AbC + Abc + abC + abc
Collecting together identical terms into single terms, and eliminating the
surplus terms, this becomes
ABC + ABc + AbC + Abc + aBC + abC + abc
This is a developed alternational normal schema. All of the terms have the
same number of variables, and the expression is made up of terms of the form
ABC, all combined by alternation or the logical OR operation.
Now, the connection between developed alternational schemata and
minterms is simple - they're one and the same. This tutorial is simply a dem-
onstration of a more formal method of derivation for those with the necessary
background. The example I have used to demonstrate the technique in action
was chosen deliberately to illustrate this connection. Basically, all that a
programmer does when picking minterms is an operation of the above kind, even
if using a more intuitive and less formal method.
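For those who would rather let a machine do the expansion, the procedure can
be sketched in Python on a host machine. The code below follows the
conventions of this tutorial (upper case = variable, lower case = NOT
variable, + = OR) for the three variables A, B and C. The function names are
made up, and terms containing both a letter and its negation are not handled.
The minterm_byte function additionally assumes the usual blitter bit
assignment, in which ABC selects bit 7 and abc selects bit 0 of the logic
function byte. That assignment is not stated in this tutorial, so check it
against the blitter section.

```python
from itertools import product

VARIABLES = "ABC"


def develop(expression, variables=VARIABLES):
    """Expand e.g. 'AB + C + b' into its developed (full-width) terms."""
    developed = set()
    for term in expression.replace(" ", "").split("+"):
        choices = []
        for v in variables:
            if v in term:
                choices.append(v)                # variable present as typed
            elif v.lower() in term:
                choices.append(v.lower())        # negated variable present
            else:
                choices.append(v + v.lower())    # missing: AND with (V + v)
        developed.update("".join(p) for p in product(*choices))
    # Order terms ABC, ABc, AbC, ... abc (lower case sorts later)
    return sorted(developed, key=lambda t: [c.islower() for c in t])


def minterm_byte(expression):
    """Logic function byte, assuming ABC = bit 7 down to abc = bit 0."""
    value = 0
    for term in develop(expression):
        bit = sum(1 << (2 - i) for i, c in enumerate(term) if c.isupper())
        value |= 1 << bit
    return value
```

Calling develop("AB + C + b") gives the seven terms ABC, ABc, AbC, Abc, aBC,
abC and abc, matching the worked example, and only aBc is missing.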